Challenge 3: Automation


Introduction

Now that all of the historical data has been copied over to BigQuery, we can optimize the Spanner environment by deleting the data that is more than a year old from the Spanner database, since that data is only needed for analytics. However, Spanner limits the number of mutations that can be performed in a single transaction, so we’ll have to batch the removal of the data and script it. Many tools could do this, but we’ll use Application Integration.

Application Integration is an Integration-Platform-as-a-Service (iPaaS) solution in Google Cloud that offers a comprehensive set of core integration tools to connect and manage the multitude of applications (Google Cloud services and third-party SaaS) and data required to support various business operations.
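To illustrate why the removal has to be batched, here is a minimal Python sketch of the idea, not the prepared pipeline itself. It splits a set of row keys into groups that each stay under a per-transaction mutation budget and hands every group to a delete callback. The names `batch_keys`, `delete_in_batches`, `mutation_limit`, `mutations_per_row` and the `delete_batch` callback are all hypothetical; the actual limit and the number of mutations a deleted row costs depend on your Spanner schema and the current quotas, so they are passed in as parameters rather than hard-coded.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

K = TypeVar("K")


def batch_keys(
    keys: Iterable[K], mutation_limit: int, mutations_per_row: int = 1
) -> Iterator[List[K]]:
    """Yield lists of keys so that each list fits within mutation_limit.

    mutation_limit and mutations_per_row are assumptions supplied by the
    caller; they are not read from Spanner.
    """
    if mutation_limit < 1 or mutations_per_row < 1:
        raise ValueError("mutation_limit and mutations_per_row must be >= 1")
    rows_per_batch = mutation_limit // mutations_per_row
    if rows_per_batch < 1:
        raise ValueError("a single row would exceed the mutation limit")

    batch: List[K] = []
    for key in keys:
        batch.append(key)
        if len(batch) == rows_per_batch:
            yield batch
            batch = []
    if batch:
        # Emit the final, partially filled batch.
        yield batch


def delete_in_batches(
    keys: Iterable[K],
    delete_batch: Callable[[List[K]], None],
    mutation_limit: int,
    mutations_per_row: int = 1,
) -> int:
    """Call delete_batch once per batch and return the number of batches.

    delete_batch is a hypothetical callback that would run one transaction
    deleting the given keys.
    """
    batches = 0
    for batch in batch_keys(keys, mutation_limit, mutations_per_row):
        delete_batch(batch)
        batches += 1
    return batches


if __name__ == "__main__":
    # Stand-in for a real table: delete ten fake rows, three per transaction.
    table = set(range(10))
    used = delete_in_batches(
        sorted(table), lambda b: table.difference_update(b), mutation_limit=3
    )
    print(f"{used} transactions, {len(table)} rows left")
```

In the challenge itself this batching is handled by the Application Integration pipeline; the sketch only shows the shape of the problem it solves.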

Description

We’ll use the integrationcli tool to configure the pipeline for removing the historical data from Spanner. Go ahead and follow the instructions to install it.

We’ve already prepared the Application Integration pipeline for you here. Download it to the VM where you installed the integrationcli and publish it. Use the same region as the Spanner instance, set the environment name for the scaffolding to dev, and grant permissions.

Once the integration is available in the Google Cloud Console, open it and run it by clicking the Test button and choosing the Delete Order Items task (keep the default parameters).

Note
The pipeline deletes the rows in multiple batches asynchronously, so give it a few minutes before verifying that all the historical data has been removed.

Success Criteria

  • The Application Integration pipeline has been successfully run.
  • All the historical data (anything older than 2024-01-31) has been removed from the Spanner database.
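Because the deletion runs asynchronously, the second criterion may not hold right after the pipeline starts. One way to check it is to poll until no rows older than the cutoff remain. The following is a minimal Python sketch under stated assumptions: `wait_until_purged` and the `count_old_rows` callback are hypothetical names, and the callback is expected to run a count query of your own against the Spanner table (the table and column names are not given here, so they are left to you). The timeout and interval defaults are illustrative values, not recommendations from the challenge.

```python
import time
from datetime import date
from typing import Callable

# Cutoff taken from the success criteria above.
CUTOFF = date(2024, 1, 31)


def wait_until_purged(
    count_old_rows: Callable[[date], int],
    cutoff: date = CUTOFF,
    timeout_s: float = 600.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until count_old_rows(cutoff) returns 0 or the timeout expires.

    count_old_rows is a hypothetical callback that returns how many rows
    older than the cutoff are still present. Returns True once the count
    reaches zero, False if the timeout is hit first.
    """
    deadline = clock() + timeout_s
    while True:
        if count_old_rows(cutoff) == 0:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated counts standing in for real query results.
    remaining = iter([1200, 300, 0])
    done = wait_until_purged(lambda _cutoff: next(remaining), sleep=lambda _s: None)
    print("purged" if done else "timed out")
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting; in normal use the defaults are enough.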

Learning Resources

Tips

  • In principle, you could install the integrationcli tool on any VM, but Cloud Shell is the easiest option.
  • If you’ve installed the tool on Cloud Shell, you can use the --default-token option to authenticate.
