Challenge 7: Close the loop

Introduction

If you’ve completed all of the previous challenges, you’re now ready to bring it all together. This task is all about automating the whole process, so that when Model Monitoring raises an alert, a new model is trained and deployed.

As in the previous challenges, if you’ve chosen the online inferencing path, continue to Online Loop; otherwise, skip to the Batch Loop section.

Note
For this challenge we’ll keep things simple: we’ll reuse the original training data to retrain, and we won’t do anything if the new model is not better. In the real world you’d combine the existing data with the new data, and take manual action if automatic retraining doesn’t yield better results.

Online Loop

Description

Use the provided build pipeline (clouddeploy.yaml) to create a new Cloud Build trigger. Configure it to run in response to messages published to the Pub/Sub topic that’s used for the Model Monitoring notifications. Also provide the necessary variables, such as the model training code version, the endpoint name, etc. Name this trigger CT-CD (or continuous-training-and-delivery).
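When testing the trigger by hand, it can help to know what a Pub/Sub message looks like in JSON form: the payload is base64-encoded in the data field. The sketch below wraps and unwraps such a message. The outer "message" wrapper follows the Pub/Sub push/REST style, and the alert payload is hypothetical; the real fields sent by Model Monitoring are not shown in this challenge.

```python
import base64
import json


def encode_message(payload: dict) -> dict:
    """Wrap a payload the way Pub/Sub represents message data in JSON."""
    data = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")
    return {"message": {"data": data}}


def decode_message(envelope: dict) -> dict:
    """Recover the original payload from the base64-encoded data field."""
    raw = base64.b64decode(envelope["message"]["data"])
    return json.loads(raw)


# Hypothetical alert payload, for illustration only.
envelope = encode_message({"alert": "hypothetical-drift-alert"})
print(decode_message(envelope))
```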

Success Criteria

There’s a correctly configured build pipeline that can be triggered through Pub/Sub messages, named CT-CD (or continuous-training-and-delivery).
Model Monitoring alerts can trigger the mentioned build through the Pub/Sub notification channel.
There’s at least one successful build.
No code was modified.

Tips

If you create the topic before you create the notification channel, you can copy its fully qualified name and paste it when configuring the notification channel.
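For reference, a fully qualified Pub/Sub topic name has the form projects/PROJECT_ID/topics/TOPIC_ID. A minimal sketch that assembles it, with hypothetical project and topic identifiers:

```python
def topic_path(project_id: str, topic_id: str) -> str:
    """Return the fully qualified Pub/Sub topic name."""
    if not project_id or not topic_id:
        raise ValueError("project_id and topic_id must be non-empty")
    return f"projects/{project_id}/topics/{topic_id}"


# Hypothetical identifiers, for illustration only.
print(topic_path("my-project", "model-monitoring-alerts"))
```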

Learning Resources

Triggering Cloud Build with Pub/Sub events

Batch Loop

Description

Batch predictions are typically asynchronous and scheduled to run periodically (daily, weekly, etc.). You can trigger batch jobs in different ways; for this challenge we’ll use Cloud Build pipelines in combination with Vertex AI pipelines. Create a new Cloud Build trigger using the provided batchdeploy.yaml file, and don’t forget to set the required variables. Call this trigger CD (or continuous-delivery) and make sure that the build pipeline is triggered through webhook events. Create a new Cloud Scheduler job that runs every Sunday at 3:30 and uses the webhook event URL as the execution method.
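Cloud Scheduler takes its schedule in unix-cron format. The sketch below shows one way to write "every Sunday at 3:30" and to check when the next run would fall. It assumes that 3:30 means 03:30 in the morning in the job's time zone; the helper name is made up for this example.

```python
from datetime import datetime, timedelta

# unix-cron fields: minute hour day-of-month month day-of-week (0 = Sunday)
SUNDAY_0330 = "30 3 * * 0"


def next_sunday_0330(now: datetime) -> datetime:
    """Return the first Sunday 03:30 strictly after `now` (naive local time)."""
    candidate = now.replace(hour=3, minute=30, second=0, microsecond=0)
    # datetime.weekday(): Monday = 0 ... Sunday = 6
    days_ahead = (6 - now.weekday()) % 7
    candidate += timedelta(days=days_ahead)
    if candidate <= now:
        # Already at or past this week's slot, so move to next Sunday.
        candidate += timedelta(days=7)
    return candidate


print(SUNDAY_0330, next_sunday_0330(datetime.now()))
```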

Running the batch predictions periodically only gets us halfway. We also need to watch for Model Monitoring alerts and act on them. Another Cloud Build pipeline definition, provided in clouddeploy.yaml, is responsible for retraining. Configure it in a new Cloud Build trigger, call it CT (or continuous-training), and set the required variables (remember to set ENDPOINT to [none]; the others should be familiar, and when in doubt have a look at the YAML file). Use Pub/Sub messages as the trigger event and pick the topic that’s configured for the Model Monitoring Pub/Sub notification channel.

Success Criteria

There’s a correctly configured build pipeline for batch predictions that can be triggered with webhooks, called CD (or continuous-delivery).
There’s a Cloud Scheduler job that is configured to run every Sunday at 3:30, triggering the batch predictions build pipeline.
There’s a correctly configured build pipeline for retraining that can be triggered with Pub/Sub messages, called CT (or continuous-training).
Show that all the components have run at least once.
No code was modified.

Tips

If you create the topic before you create the notification channel, you can copy its name and paste it when configuring the notification channel.
The webhook URL configuration in Cloud Scheduler requires the Content-Type header to be set to application/json; otherwise the webhook call won’t work.
You can force-run a Cloud Scheduler job; there’s no need to wait until Sunday :).
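To try the webhook outside of Cloud Scheduler, a request with the same header can be built by hand. This is a minimal sketch using only the Python standard library; the URL is a hypothetical placeholder (use the webhook URL that Cloud Build shows for your trigger), and the empty JSON body is an assumption.

```python
import json
import urllib.request

# Hypothetical placeholder, not a real webhook URL.
WEBHOOK_URL = "https://example.com/your-cloud-build-webhook-url"


def build_webhook_request(url: str, payload: dict) -> urllib.request.Request:
    """Build a POST request with a JSON body and the required Content-Type."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_webhook_request(WEBHOOK_URL, {})
# urllib stores header names capitalized, hence "Content-type" here.
print(request.get_method(), request.get_header("Content-type"))
# To actually send it: urllib.request.urlopen(request)
```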

Learning Resources
