Schedule Standby Azure DevOps scale set agents

I’ve been running Azure DevOps scale set agents for a while now. One of the features that I find lacking is the ability to schedule how many standby agents you have. Standby agents are build agent instances that are always running. They don’t time out because of inactivity like normal agents, which means that your build will start quickly, rather than having to wait for an instance to deploy.

Ideally, you want at least 1 during working hours, but none out of hours. However, at the moment, you can only set how many you have, which means you are paying for these instances when you’re not using them.

There was a mention of adding this scheduling functionality to DevOps on the official 2020 Q3 roadmap, but this seems to be missing from the latest 2021 Q1 roadmap, which is a shame. So I have put together a simple Azure function to provide this.

Creating the REST request

When I want to automate DevOps, I start by watching the API traffic when the task is run manually. I use the new Edge browser, but the steps should be similar when using other browsers.

  1. Open your browser and navigate to Settings > Agent pools.

  2. Click on the Settings option at the top of the screen.

  3. Press F12 to open the developer tools, select the Network tab at the top, then update the value of standby agents and hit Save to send the request.

Agent pool settings screen

  4. We are interested in the Request Url and the Request Method, which can be seen on the Headers sub-tab (Is a sub-tab a thing?).

Developer Tools

  5. The end of the Request Url is the agent pool ID. This ID is how DevOps refers to the different pools and can be found in the URL bar on any page in DevOps that relates to your pool. For example, the agent pool settings screen.

Agent pool settings screen

  6. The next information we need is on the Preview sub-tab. This is the body of the REST request.

Agent pool settings screen

With the information we have gathered, our request looks like this. Notice I’ve added ?api-version=6.1-preview.1 to the end of the URL.

Request URL: https://dev.azure.com/YourOrganisationName/_apis/distributedtask/elasticpools/YourPoolId?api-version=6.1-preview.1

Method: PATCH

Body:

    {
    "recycleAfterEachUse": false,
    "maxSavedNodeCount": 0,
    "maxCapacity":<maxCapacity>,
    "desiredIdle":<desiredIdle>,
    "timeToLiveMinutes": 15,
    "agentInteractiveUI": false
    }
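If you would rather not hand-edit the JSON, the same body can be built from typed values. This is only a minimal sketch: the function name New-ElasticPoolBody and its parameter names are my own invention, and the field names are simply the ones captured from the browser above.

```powershell
# Hypothetical helper (not part of the original script): builds the PATCH body
# from typed values instead of substituting placeholders in a string.
function New-ElasticPoolBody {
    param(
        [Parameter(Mandatory = $true)] [int] $MaxCapacity,
        [Parameter(Mandatory = $true)] [int] $DesiredIdle,
        [bool] $RecycleAfterEachUse = $false,
        [int] $TimeToLiveMinutes = 15
    )

    # An ordered hashtable keeps the keys in the same order as the captured request.
    $body = [ordered]@{
        recycleAfterEachUse = $RecycleAfterEachUse
        maxSavedNodeCount   = 0
        maxCapacity         = $MaxCapacity
        desiredIdle         = $DesiredIdle
        timeToLiveMinutes   = $TimeToLiveMinutes
        agentInteractiveUI  = $false
    }

    # ConvertTo-Json writes booleans as lowercase true/false and integers unquoted.
    return ($body | ConvertTo-Json -Compress)
}

$jsonBody = New-ElasticPoolBody -MaxCapacity 4 -DesiredIdle 1
$jsonBody
```

Letting ConvertTo-Json do the serialisation avoids the easy mistake of sending True instead of true, which is not valid JSON.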

If you want to test the request on its own, you could use Postman to send the request to DevOps. This helps you confirm the request works before you send it with PowerShell. It can be a right pain, trying to work out if the REST or PowerShell is wrong.

I won’t go into how to use Postman, but your request should look like this:

Authorization

You need to create a DevOps PAT (personal access token) that has Read & Manage permissions to your build agent pools. You enter this here, along with the email address of the account that created the PAT.

Postman Authorization screen
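Postman is sending the username and PAT as a Basic authentication header. For reference, here is a rough sketch of building the same header in PowerShell. The helper name New-BasicAuthHeader and the example values are made up for illustration.

```powershell
# Hypothetical helper: builds a Basic authentication header from a username and PAT.
function New-BasicAuthHeader {
    param(
        [Parameter(Mandatory = $true)] [string] $Username,
        [Parameter(Mandatory = $true)] [string] $Pat
    )

    # Basic auth is "username:password" encoded as Base64.
    $pair    = "{0}:{1}" -f $Username, $Pat
    $encoded = [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes($pair))

    return @{ Authorization = "Basic $encoded" }
}

# Example values only; never hard-code a real PAT.
$headers = New-BasicAuthHeader -Username "user@example.com" -Pat "not-a-real-pat"
$headers.Authorization
```

The resulting hashtable can be passed straight to the -Headers parameter of Invoke-RestMethod.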

Body

Paste in the JSON body from above, making sure to set the values (integers) for maxCapacity and desiredIdle. You will need to set the radio button to raw and then select JSON from the dropdown.

Postman Request Body

Hit Send and you should see the below output. If the status does not return 200 OK, double check your values. The messages are usually helpful, but if you can’t work it out, let me know.

Postman Return screen

Now that we have a working request, we need to create the PowerShell function to run it.

Creating the function using VSCode

Microsoft have some great documentation on how to create Azure PowerShell functions, both using VSCode and using the command line. I don’t think it really matters which you choose, but I used VSCode.

  1. Create a new folder/Git repo on your machine and then open this with VSCode.

  2. Follow your preferred Microsoft article and make a function with the following attributes:

    • Language: PowerShell
    • Template: Timer Trigger
    • Cron Timer: 0 */15 * * * * (you can change this to another CRON schedule)

  3. You should end up with something that looks like this:

Agent pool settings screen

  4. Other than the CRON schedule you shouldn’t need to change anything in the function configuration, but we do need to replace the run.ps1 with the below.

When I originally wrote this code, it was a quick fix to an issue I had. It’s what one of my colleagues would call “a thing of beauty” 😂. My assumption was that Microsoft would complete their feature within a few weeks and then I could turn this off. It works fine, but feel free to refactor it.

NOTE: The code window can make this hard to read, so I have linked to a GitHub repo for the code at the end of the post.

# Input bindings are passed in via param block.
param($Timer)

# Get the current universal time in the default string format
$currentUTCtime = (Get-Date).ToUniversalTime()

############ functions ############

function Set-ScaleSet {
    [CmdletBinding(SupportsShouldProcess)]
    Param
    (

        [Parameter(
            Mandatory = $true,
            HelpMessage = "ID of the target agent pool"
        )]
        [ValidateNotNullOrEmpty()]
        [String]
        $targetScaleSetId,

        [Parameter(
            Mandatory = $true,
            HelpMessage = "devops Username"
        )]
        [ValidateNotNullOrEmpty()]
        [String]
        $username,

        [Parameter(
            Mandatory = $true,
            HelpMessage = "devops PAT"
        )]
        [ValidateNotNullOrEmpty()]
        [String]
        $PAT,

        [Parameter(
            Mandatory = $true,
            HelpMessage = "Number of standby (idle) agents"
        )]
        [ValidateNotNullOrEmpty()]
        [int]
        $desiredIdleCount,

        [Parameter(
            Mandatory = $true,
            HelpMessage = "Maximum number of instances"
        )]
        [ValidateNotNullOrEmpty()]
        [int]
        $maxCapacity,

        [Parameter(
            Mandatory = $true,
            HelpMessage = "Recycle instances after each job"
        )]
        [ValidateNotNullOrEmpty()]
        [bool]
        $recycleAfterEachUse,

        [Parameter(
            Mandatory = $true,
            HelpMessage = "devops organisation name"
        )]
        [ValidateNotNullOrEmpty()]
        [String]
        $orgName
    )

    Begin {
    }
    Process {
        If ($PSCmdlet.ShouldProcess("Agent pool $targetScaleSetId", "Update standby agent settings")) {
            $DevOpsCreds = [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes(("{0}:{1}" -f $username, $PAT)))

            $BodySource = @"
    {
        "recycleAfterEachUse":<<recycleAfterEachUse>>,"maxSavedNodeCount":0,"maxCapacity":<<maxCapacity>>,"desiredIdle":<<desiredIdle>>,"timeToLiveMinutes":15,"agentInteractiveUI":false
    }
"@
            # JSON booleans must be lowercase, so convert $true/$false to "true"/"false"
            $jsonBody = $BodySource.replace("<<desiredIdle>>", $desiredIdleCount).replace("<<maxCapacity>>", $maxCapacity).replace("<<recycleAfterEachUse>>", $recycleAfterEachUse.ToString().ToLower())

            Write-Output "jsonBody:"
            $jsonBody

            $setElasticPoolsUri = "https://dev.azure.com/$orgName/_apis/distributedtask/elasticpools/" + $targetScaleSetId + "?api-version=6.1-preview.1"

            Write-Output "setElasticPoolsUri: $($setElasticPoolsUri)"

            Write-Output "Set Scale Set Response for: $targetScaleSetId"

            Invoke-RestMethod -Uri $setElasticPoolsUri -Method Patch -Body $jsonBody -ContentType "application/json" -Headers @{Authorization = ("Basic {0}" -f $DevOpsCreds) }

        } # End $PSCmdlet.ShouldProcess
    }
    End {

    }
}
####################################


# The 'IsPastDue' property is 'true' when the current function invocation is later than scheduled.
if ($Timer.IsPastDue) {
    Write-Host "PowerShell timer is running late!"
}

Write-Host "PowerShell timer trigger function Started at UTC TIME: $currentUTCtime"

$username = $env:azdoUser
$PAT = $env:azdoPAT
$targetScaleSets = $env:azdoScaleSets
$desiredIdle = $env:desiredIdle
$powerOnHour = $env:PowerOnHour
$powerOffHour = $env:PowerOffHour
$recycleAfterEachUse = $env:recycleAfterEachUse
$maxCapacity = $env:maxCapacity
$orgName = $env:orgName


$DevOpsCreds = [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes(("{0}:{1}" -f $username, $PAT)))
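# Use UTC so the schedule matches the PowerOnHour/PowerOffHour settings regardless of the host's time zone
$d = $currentUTCtime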

$getElasticPoolsUri = "https://dev.azure.com/$orgName/_apis/distributedtask/elasticpools?api-version=6.1-preview.1"
Write-Output "getElasticPoolsUri: $($getElasticPoolsUri)"

$getElasticPoolsUriResponse = (Invoke-RestMethod -Uri $getElasticPoolsUri -Method Get -ContentType "application/json" -Headers @{Authorization = ("Basic {0}" -f $DevOpsCreds) }).value

if ($getElasticPoolsUriResponse) {

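    # Weekday within working hours: keep the configured number of standby agents.
    # Both day checks must pass (-and); with -or the test is true on every day of the week.
    if ((($d.DayOfWeek -notmatch "Sat") -and ($d.DayOfWeek -notmatch "Sun")) -and (($d.Hour -ge $powerOnHour ) -and ($d.Hour -lt $powerOffHour ))) {
        Write-Information -InformationAction Continue -MessageData "Weekday between power-on and power-off hours"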


        foreach ($item in $getElasticPoolsUriResponse) {
            $scalesetName = ($item.azureId).split("/")[8]
            $poolID = $item.poolid
            Write-Output "Main Script Pool ID: $($poolID)"
            Write-Output "Main Script desired Idle: $($desiredIdle)"

            Write-Information -InformationAction Continue -MessageData "Name: $scalesetName"

            if ($targetScaleSets -match $scalesetName) {
                Write-Information -InformationAction Continue -MessageData "Found scaleset: $scalesetName"

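                # Environment variables are strings, so convert recycleAfterEachUse to a real boolean
                Set-ScaleSet -targetScaleSetId $poolID -desiredIdleCount $desiredIdle -username $username -PAT $pat -maxCapacity $maxCapacity -recycleAfterEachUse ([System.Convert]::ToBoolean($recycleAfterEachUse)) -orgName $orgName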


            }
            else {
                Write-Information -InformationAction Continue -MessageData "Did not find scaleset: $scalesetName"
            }
        } # End of foreach ($item in $getElasticPoolsUriResponse){


    } # End of if (weekday within working hours)
    # Weekday outside working hours: the hour is before power-on OR at/after power-off
    elseif ((($d.DayOfWeek -notmatch "Sat") -and ($d.DayOfWeek -notmatch "Sun")) -and (($d.Hour -lt $powerOnHour ) -or ($d.Hour -ge $powerOffHour ))) {
        Write-Information -InformationAction Continue -MessageData "Weekday outside power-on and power-off hours"


        foreach ($item in $getElasticPoolsUriResponse) {
            $scalesetName = ($item.azureId).split("/")[8]
            $poolID = $item.poolid

            Write-Information -InformationAction Continue -MessageData "Name: $scalesetName"

            if ($targetScaleSets -match $scalesetName) {
                Write-Information -InformationAction Continue -MessageData "Found scaleset: $scalesetName"


                Set-ScaleSet -targetScaleSetId $poolID -desiredIdleCount 0 -username $username -PAT $pat -maxCapacity $maxCapacity -recycleAfterEachUse ([System.Convert]::ToBoolean($recycleAfterEachUse)) -orgName $orgName
            }
            else {
                Write-Information -InformationAction Continue -MessageData "Did not find scaleset: $scalesetName"
            }
        } # End of foreach ($item in $getElasticPoolsUriResponse){


    } # End of elseif (weekday outside working hours)
    else {
        Write-Information -InformationAction Continue -MessageData "Weekend"


        foreach ($item in $getElasticPoolsUriResponse) {
            $scalesetName = ($item.azureId).split("/")[8]
            $poolID = $item.poolid

            Write-Information -InformationAction Continue -MessageData "Name: $scalesetName"

            if ($targetScaleSets -match $scalesetName) {
                Write-Information -InformationAction Continue -MessageData "Found scaleset: $scalesetName"


                Set-ScaleSet -targetScaleSetId $poolID -desiredIdleCount 0 -username $username -PAT $pat -maxCapacity $maxCapacity -recycleAfterEachUse ([System.Convert]::ToBoolean($recycleAfterEachUse)) -orgName $orgName
            }
            else {
                Write-Information -InformationAction Continue -MessageData "Did not find scaleset: $scalesetName"
            }
        } # End of foreach ($item in $getElasticPoolsUriResponse){
    }
} # End of if($getElasticPoolsUriResponse){
else {
    Write-Error "No response from get elastic pool request"
}

# Write an information log with the current time.
Write-Host "PowerShell timer trigger function finished at UTC TIME: $((Get-Date).ToUniversalTime())"

You’ll notice a variable section mapping local vars to environment variables. This made it easier to reference some of the variables when I concatenate strings.

These variables are initially stored in the local.settings.json. Later they will be added to the function configuration.

Create the local.settings.json file in the root of the function project. The contents should match the below, but you will need to substitute the <> placeholders for real values. The file should be excluded from git automatically, but it’s worth checking if it appears in any of your commits. If it does, right click on it and select add to .gitignore.

{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "FUNCTIONS_WORKER_RUNTIME_VERSION": "~7",
    "FUNCTIONS_WORKER_RUNTIME": "powershell",
    "azdoPAT": "<pat>",
    "azdoUser": "<emailAddress>",
    "azdoScaleSets": "<scaleSet1,scaleSet2>",
    "desiredIdle": "1",
    "PowerOnHour": "7",
    "PowerOffHour": "19",
    "maxCapacity": "4",
    "recycleAfterEachUse": "false",
    "orgName": "<yourDevOpsOrgName>"
  }
}
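One thing to keep in mind is that these settings reach the function as environment variables, and environment variables are always strings. Here is a small sketch of converting them to typed values. The values are set inline purely so the example runs on its own.

```powershell
# Simulate the settings for this example only; in the function they come from
# local.settings.json or the function app configuration.
$env:desiredIdle         = "1"
$env:maxCapacity         = "4"
$env:recycleAfterEachUse = "false"

# Numeric settings cast cleanly to [int].
$desiredIdle = [int]$env:desiredIdle
$maxCapacity = [int]$env:maxCapacity

# Careful with booleans: [bool]"false" is $true in PowerShell, because any
# non-empty string is truthy. Convert.ToBoolean parses the text instead.
$recycleAfterEachUse = [System.Convert]::ToBoolean($env:recycleAfterEachUse)

"desiredIdle=$desiredIdle maxCapacity=$maxCapacity recycleAfterEachUse=$recycleAfterEachUse"
```

Converting once, near the top of the script, means the rest of the code can treat the values as real numbers and booleans.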

We have already discussed obtaining the values for azdoPAT and azdoUser. You can reuse the PAT you created earlier (or any existing PAT with the correct rights).

azdoScaleSets is a comma-separated list of the scale sets you wish to manage (make sure this is the scale set names and not the agent pool names).
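The script checks this list with -match, which is a regular expression match against the whole string, so a scale set called build would also match an entry called build-linux. If that matters for your naming, a stricter check could look something like this. The function name Test-ScaleSetTargeted and the scale set names are hypothetical.

```powershell
# Hypothetical helper: returns $true only when the scale set name is an exact
# entry in the comma-separated list.
function Test-ScaleSetTargeted {
    param(
        [string] $TargetScaleSets,
        [Parameter(Mandatory = $true)] [string] $ScaleSetName
    )

    # Split on commas, trim stray whitespace and drop empty entries.
    $names = $TargetScaleSets -split ',' |
        ForEach-Object { $_.Trim() } |
        Where-Object { $_ }

    # -contains compares whole entries (case-insensitively), not substrings.
    return ($names -contains $ScaleSetName)
}

Test-ScaleSetTargeted -TargetScaleSets "build-linux, build-windows" -ScaleSetName "build-linux"   # True
Test-ScaleSetTargeted -TargetScaleSets "build-linux, build-windows" -ScaleSetName "build"         # False
```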

desiredIdle is the number of standby agents you would like (1 seems to work quite well with a small team).

PowerOnHour is the hour you wish the standby instances to be available (UTC, 24-hour). I have set this for 7am.

PowerOffHour is the hour you wish the standby instances to stop being available (UTC, 24-hour). I have set this for 7pm.
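If you do fancy refactoring, the schedule these three settings describe boils down to one small decision. Here is a minimal sketch of it as a single function, assuming the power-on hour is earlier than the power-off hour. The name Get-DesiredIdleCount is my own and is not in the script above.

```powershell
# Hypothetical helper: returns how many standby agents should be running at a
# given UTC time. Assumes PowerOnHour is earlier than PowerOffHour.
function Get-DesiredIdleCount {
    param(
        [Parameter(Mandatory = $true)] [datetime] $UtcNow,
        [Parameter(Mandatory = $true)] [int] $PowerOnHour,
        [Parameter(Mandatory = $true)] [int] $PowerOffHour,
        [Parameter(Mandatory = $true)] [int] $DesiredIdle
    )

    $isWeekend = $UtcNow.DayOfWeek -in @([System.DayOfWeek]::Saturday, [System.DayOfWeek]::Sunday)
    $inHours   = ($UtcNow.Hour -ge $PowerOnHour) -and ($UtcNow.Hour -lt $PowerOffHour)

    # Standby agents only on weekdays within working hours; otherwise none.
    if ((-not $isWeekend) -and $inHours) {
        return $DesiredIdle
    }
    return 0
}

# 1 March 2021 was a Monday, 6 March 2021 a Saturday.
Get-DesiredIdleCount -UtcNow ([datetime]::new(2021, 3, 1, 9, 0, 0))  -PowerOnHour 7 -PowerOffHour 19 -DesiredIdle 1   # 1
Get-DesiredIdleCount -UtcNow ([datetime]::new(2021, 3, 1, 19, 0, 0)) -PowerOnHour 7 -PowerOffHour 19 -DesiredIdle 1   # 0
Get-DesiredIdleCount -UtcNow ([datetime]::new(2021, 3, 6, 9, 0, 0))  -PowerOnHour 7 -PowerOffHour 19 -DesiredIdle 1   # 0
```

With the decision in one place, the script would only need a single loop over the pools, passing the returned value as the desired idle count.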

I added the next few settings because they were available in the API.

maxCapacity is the maximum number of instances you want to deploy. So you won’t be able to create more than 4 instances in my example. It’s a simple form of cost control.

recycleAfterEachUse controls if the instances are marked for deletion after each build has run. This feature should be awesome, but it takes quite a while for these instances to tear down. So I have left this as false for now.

orgName is the name of your Azure DevOps organisation. So if your URL is https://dev.azure.com/orange, your organisation name is orange.

When the local settings are in place, you will be able to test the function locally. This requires its own prerequisites, but these are covered in Microsoft’s documentation.

At first, testing a scheduled function doesn’t seem as straightforward as an HTTP function, but the VSCode tooling is quite clever. You can right click on your function and select Execute Function Now…

execute function in VSCode

Deploy the function to Azure

Once your local function is working, you can use VSCode to deploy it to Azure. Microsoft make it easy: they have created an excellent walkthrough for the deployment wizard. It may take a few minutes, but you will end up with a function that looks a bit like this.

execute function in VSCode

We now need to add the configuration values we had in the local.settings.json.

  1. Browse to Settings > Configuration in the left hand menu and start adding settings using the New application setting button.

execute function in VSCode

  2. Once all settings have been added, hit Save.

Confirm the function is working in Azure

  1. Browse to Functions > Functions and click on your function.

execute function in VSCode

  2. Click on Code + test and then Test/Run at the top.

execute function in VSCode

  3. Click Run on the dialog.

execute function in VSCode

  4. If you drag up the top of the bottom window you should see a successful response. If not, post in the comments, and I’ll try and help.

execute function in VSCode

  5. Double check your agent pool settings to see that the values have been changed. If so, you’re done.

There are a few moving parts to this solution, so I have put the example code in a GitHub repo to make it a little easier. If there is enough interest, I’ll look to add an example DevOps pipeline to deploy this as well. Hopefully, Microsoft will release the scheduling feature soon and this can all be torn down.