Tutorial: Create a Disposable SQL Server Database Container in an Azure DevOps Pipeline With Spawn

This post is part of TSQLTuesday #140, “What have you been up to with containers?”

Disposable databases are indispensable for software developers, both for initial development work and in automation pipelines. Containers pair exceptionally well with temporary databases and provide a lightweight mechanism to spin up an environment on demand.

In this post, I cover why we need disposable databases and their use cases in automated workflows, then walk through creating an Azure DevOps pipeline that spins up an AdventureWorks2019 sample data container on demand using Spawn. Spawn has a free tier, which you can try anytime.

Why we need disposable databases

There are two main reasons why we need disposable databases for software development:

  • Testing database changes requires a realistic dataset. Schema isn’t enough! Will that unique index create successfully in production? Can you add a non-nullable column to that table without a default value? How will this query perform? We need realistic datasets for these scenarios and more.

  • Databases generally don’t have good ways to “reset” after you make changes. I say “good” here because some techniques do exist. SQL Server, for example, has a database snapshot feature. One can create a database snapshot at a point in time, make some changes, and later revert to that snapshot. But this feature doesn’t scale for software development. Database snapshots aren’t portable; they are tightly coupled with the source database. They are read-only views of a point in time, not writable. The snapshot feature is also cumbersome to implement and wasn’t designed with automation in mind.
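For reference, the snapshot workflow looks roughly like this. This is a sketch using sqlcmd; the server, login, file path, and logical file name are all assumptions (the logical name must match the source database's actual data file name):

```shell
# Sketch only: the server, login, file path, and logical file name below
# are assumptions; check sys.database_files for the real logical name.
snapshot_db="AdventureWorks_Snap"

if command -v sqlcmd >/dev/null 2>&1; then
  # Create a snapshot of AdventureWorks2019 at a point in time.
  sqlcmd -S localhost -U sa -Q "CREATE DATABASE $snapshot_db
    ON (NAME = AdventureWorks2019_Data, FILENAME = '/var/opt/mssql/data/aw_snap.ss')
    AS SNAPSHOT OF AdventureWorks2019;"

  # ...make and test changes against AdventureWorks2019...

  # Revert the whole source database to the snapshot.
  sqlcmd -S localhost -U sa -Q "RESTORE DATABASE AdventureWorks2019
    FROM DATABASE_SNAPSHOT = '$snapshot_db';"
else
  echo "sqlcmd not installed; skipping sketch"
fi
```

Note that the revert takes the entire database back in time, which is exactly why this doesn't compose well with parallel development or automation.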

Without disposable databases, it is much harder to do quality testing of changes. It’s also much harder to validate deployments when reordering changes in release pipelines.

A containerized service for ephemeral database containers

Redgate’s Foundry team created Spawn to explore how a containerized service can solve the problems above. The Spawn cloud service lets users:

  • Create data images. Your image can be empty, you can create it from scripts, or you can create it from database backups.
  • Create data containers from images. These containers can be quickly created, used, and then removed.

These are just the basics. There is a lot more functionality available, including resetting containers and graduating containers to new images.

Two use cases for disposable databases in pipelines

1. Continuous Integration builds

When your repository contains database code, it’s essential to ensure the code is valid – just like any other code. Disposable databases allow you to automate code validation for your database code whenever someone commits or merges changes into a code branch. Continuous Integration can call a pipeline or action which will:

  • Create a data container including the database
  • Deploy changes from the repo to the database
  • Fail the action or pipeline if the code doesn’t deploy successfully
  • Clean up the data container at the end of the job
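As a shell sketch, those four steps might look like this. It assumes spawnctl is installed and a SPAWNCTL_ACCESS_TOKEN environment variable is set; the deploy command is a placeholder for your own migration tooling:

```shell
# CI sketch: create a disposable data container, deploy, and clean up.
# The deploy step is a placeholder for whatever tooling your project uses.
set -e

if command -v spawnctl >/dev/null 2>&1 && [ -n "$SPAWNCTL_ACCESS_TOKEN" ]; then
  # 1. Create a data container from an image, with a short lifetime.
  container=$(spawnctl create data-container --image "$DATA_IMAGE_NAME" --lifetime 20m -q)

  # 2-3. Deploy the repo's database code; set -e fails the job on error.
  # ./deploy-database.sh "$container"

  # 4. Clean up the data container at the end of the job.
  spawnctl delete data-container "$container"
else
  container="(skipped: spawnctl or access token not available)"
fi
```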

2. Pull requests

One example of using a disposable database in an automation pipeline is in a Pull Request (PR) workflow when using Git. In this workflow, a developer will:

  • Create a local Git branch
  • Make database changes locally to their environment (maybe a disposable database which they spin up just for the branch)
  • Save and commit their changes to Git locally using the team’s preferred tooling for the project
  • Push their branch up to the central repo
  • Create a Pull Request on the central repo to merge the code into a shared branch (commonly “main”)
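In git terms, the local portion of that workflow looks like the following self-contained sketch. It runs in a throwaway repo; the branch and file names are hypothetical, and the push/PR steps are comments because they need a real remote:

```shell
# Throwaway repo standing in for your project (names are hypothetical).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
git commit -q --allow-empty -m "init"          # stand-in for existing history

# 1. Create a local git branch.
git checkout -q -b feature/add-customer-email

# 2-3. Make database changes locally, then save and commit them.
mkdir -p migrations
echo "ALTER TABLE Customer ADD Email nvarchar(320);" > migrations/001_add_email.sql
git add migrations/
git commit -q -m "Add Email column to Customer"

# 4. Push the branch to the central repo (needs a real remote):
# git push -u origin feature/add-customer-email
# 5. Open a Pull Request targeting main in the web UI.

git branch --show-current   # -> feature/add-customer-email
```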

There is often an option for automation to kick off at this point to validate the code in your branch. Effective automation saves work for the folks reviewing your Pull Request while enabling them to do a high-quality review of your changes. If you have a database cloning solution, this automation can:

  • Create a data container including the database
  • Deploy changes from the repo to the database

Optionally, you can keep the database around for the reviewers to inspect as part of the PR review. Many teams prefer to build an entire environment (database, application components, etc.) for use in PR reviews.

Prerequisites to follow this tutorial

All of the accounts, services, and tools used in this tutorial offer free tiers. To follow along, you need:

  • A GitHub account to connect to Spawn.
  • An Azure DevOps project - the free tier works fine. The project repo can be empty (no database code is required).
  • To install the Spawn command line on a computer. This has a few simple steps: log in to the Spawn web app with GitHub, download the Spawn command line executable to a local folder, add the folder to your path, and then run spawnctl auth in a terminal.
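Those install steps, as shell commands. The install one-liner is the same one the pipeline YAML in this post uses; ~/.spawnctl/bin is assumed as the installer's default location and may differ on your machine:

```shell
# Download and install the spawnctl CLI, then put it on the PATH.
if command -v curl >/dev/null 2>&1; then
  curl -sL https://run.spawn.cc/install | sh
fi
export PATH="$PATH:$HOME/.spawnctl/bin"

# Authenticate interactively (opens a browser to log in with GitHub):
# spawnctl auth
```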

Note: We won’t deploy database code to the data container in this tutorial itself, but we’ll set the foundation to do that in a future tutorial.

Create an Azure DevOps pipeline

In this example, I am working with a newly created Azure DevOps project. I’ve initialized the default Git repo with a readme, but I haven’t added any code.

First, click Pipelines in the left menu bar, then click New pipeline at the top right.

Azure DevOps pipelines screen with the New pipeline button

At the Where is your code? prompt, click Azure Repos Git to create a YAML pipeline. (You could also do this with code stored in Bitbucket Cloud or GitHub.)

Azure DevOps prompt to select repo location

At the Select a repository prompt, click on the name of the repo.

Azure DevOps prompt to select a repository

Troubleshooting: If you do not see a repo name here and this is a new Azure DevOps project, click on Repos in the left bar. If it says “[YourRepoName] is empty. Add some code!” at the top, one way to get it working fast is to scroll to the bottom of the page and click Initialize with the “Add a README” option ticked.

At the Configure your pipeline prompt, select Starter pipeline.

Azure DevOps prompt to create a starter pipeline or choose existing YAML file

This should bring you to the Review your pipeline YAML screen, with simple starter code in a YAML pipeline.

Add YAML to the pipeline to create a Spawn data container

Our next step is to replace that starter pipeline code with the following YAML, which will:

  • Use an Azure-hosted agent running ubuntu-latest
  • Download and install the Spawn command line (spawnctl) to the agent
  • Authenticate to Spawn using an access token stored in a variable named $(SPAWNCTL_ACCESS_TOKEN)
  • Create a data container based on the image name specified in a variable named $(DATA_IMAGE_NAME), with a lifetime of 20 minutes

trigger:
- main

pool:
  vmImage: ubuntu-latest

# variable: $(SPAWNCTL_ACCESS_TOKEN) - https://docs.spawn.cc/commands/spawnctl-accesstoken-get
# Variable:  $(DATA_IMAGE_NAME) - to test quickly use a public image https://docs.spawn.cc/other/public-data-images 

steps:
  - script: |
      echo "Downloading and installing spawnctl..."
      curl -sL https://run.spawn.cc/install | sh      
    displayName: Install spawnctl
  - script: |
      set -e
      export PATH=$PATH:$HOME/.spawnctl/bin
      dataContainer=$(spawnctl create data-container --image $(DATA_IMAGE_NAME) --lifetime 20m -q)
      echo "##vso[task.setvariable variable=dataContainerName]$dataContainer"
      dataContainerJson=$(spawnctl get data-container $dataContainer -o json)
      port=$(echo $dataContainerJson | jq -r .port)
      host=$(echo $dataContainerJson | jq -r .host)
      user=$(echo $dataContainerJson | jq -r .user)
      password=$(echo $dataContainerJson | jq -r .password)      
    displayName: spawn a database
    env:
      SPAWNCTL_ACCESS_TOKEN: $(SPAWNCTL_ACCESS_TOKEN)
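The container expires on its own after 20 minutes, but a later step in the same pipeline could also read the dataContainerName variable and remove the container early. A sketch of that step's script body; it assumes the Spawn CLI's delete data-container command, and DATACONTAINERNAME is how Azure DevOps exposes the dataContainerName pipeline variable as an environment variable:

```shell
# Script body for a later pipeline step: delete the container early rather
# than waiting out its 20-minute lifetime.
export PATH="$PATH:$HOME/.spawnctl/bin"

if command -v spawnctl >/dev/null 2>&1 && [ -n "$DATACONTAINERNAME" ]; then
  spawnctl delete data-container "$DATACONTAINERNAME"
else
  echo "nothing to delete (spawnctl or container name not available)"
fi
```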

Add variables to your pipeline

On the Review your pipeline YAML screen, click Variables in the top right corner.

Azure DevOps review your pipeline YAML screen with Variables button

In the Variables pane, click New variable.

Add a new variable:

  • Set the variable name to DATA_IMAGE_NAME
  • Set the value to the name of a Spawn data image (to test quickly, use one of the public data images linked in the YAML comments)

Click OK at the bottom of the pane.

Azure DevOps pane when adding a new variable

In the Variables pane, click the plus sign to add a second variable.

Azure DevOps input pane for variable name and value

To get the value for this variable, you need to:

  1. Install the Spawn command line on a computer, if you have not done so already. (This should be fast: the page guides you to log into the Spawn web app with GitHub, download the Spawn command line executable, save it to a folder, add the folder to your path, and then run spawnctl auth in a terminal.)
  2. Run the following command to create an access token:
spawnctl create access-token --purpose "Spawn access token for pipeline test"

Copy the token from the terminal to your clipboard.

Now, back in the Azure DevOps variables pane, add a second variable:

  • Set the variable name to SPAWNCTL_ACCESS_TOKEN
  • Paste the value of your access token from your clipboard
  • Tick the “Keep this value secret” box to protect your access token

Click OK at the bottom of the pane.

Azure DevOps variables pane to add second variable.

Back in the main Variables pane, click Save at the bottom of the screen.

Azure DevOps variables pane with final save button for variable additions

Save and run your pipeline

As a quick recap, so far we have:

  • Created a starter YAML pipeline
  • Replaced the code with YAML to work with Spawn
  • Added two variables to the pipeline and saved them

Now we need to save the pipeline itself. On the Review your pipeline YAML screen, click Save and run.

Azure DevOps save and run button for the pipeline

In the Save and run pane which appears, click Save and run in the bottom right corner.

Azure DevOps confirmation that you really want to save and run

You will now be taken to a screen where you can watch your pipeline job get queued and run. If all has gone well, the job should succeed in a relatively short amount of time.

Azure DevOps screen to review the pipeline progress and status

Check out the data container created by the Pipeline

Our pipeline asked Spawn to create a copy of the AdventureWorks sample database with a lifetime of 20 minutes. We can connect to that database in SQL Server Management Studio or Azure Data Studio if we’d like to check it out.

We let Spawn name the database container in this case.

Open a terminal, and run the following command to get a list of all your data containers.

spawnctl get data-containers

This will return a list of your containers with their metadata.

spawn command line listing data containers created

Next, plug the name of your data container into the following command.

spawnctl get data-container adventure-works-ebnmatep -o json

This will output connection information for your database.

spawn command line outputting json with full connection details

Note from the output:

  • The Data Source connection string, including the port
  • The username is sa
  • A random password has been generated

You can use this information to connect to the SQL Server instance using the IDE of your choice.
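For scripting, the same fields can be extracted from that JSON with jq, just as the pipeline does. A sketch with illustrative (not real) output values:

```shell
# Illustrative JSON in the shape returned by `spawnctl get data-container -o json`;
# substitute the real output from your own container.
json='{"host":"instances.spawn.cc","port":32001,"user":"sa","password":"s3cret"}'

host=$(echo "$json" | jq -r .host)
port=$(echo "$json" | jq -r .port)
user=$(echo "$json" | jq -r .user)
password=$(echo "$json" | jq -r .password)

echo "Server: $host,$port  User: $user"

# Connect with sqlcmd, if installed:
# sqlcmd -S "$host,$port" -U "$user" -P "$password" -Q "SELECT @@VERSION;"
```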

Instance connection screen in Azure Data Studio with details on spawn instance

If you enjoy using Azure Data Studio, the Spawn extension for Azure Data Studio makes connecting to your Spawn instances simple.

Other ways to get started with Spawn

If you’d like to play around with Spawn locally, follow the Spawn Getting Started tutorial.

If you’d prefer to use Spawn with GitHub Actions, there’s a tutorial for that as well.

Thanks for the great TSQLTuesday topic

I’ve wanted to write this article for a while, but I kept not getting around to it. Thanks to Anthony Nocentino for the writing prompt.