Real-time analytics with Change Data Capture
Introduction
This gHack will take you through replicating and processing operational data from an Oracle database into Google Cloud in real time. You’ll also figure out how to forecast future demand, and how to visualize this forecast data as it arrives.
We will be using a fictitious retail store named FastFresh to help demonstrate the concepts we’ll be dealing with. FastFresh specializes in selling fresh produce, and wants to minimize food waste and optimize stock levels across all stores. You will use fictitious sales transactions from FastFresh as the operational data in this tutorial.
The diagram above shows how operational data flows through Google Cloud:
- Incoming data from an Oracle source is captured and replicated into Cloud Storage through Datastream.
- This data is processed and enriched by Dataflow templates, and is then sent to BigQuery.
- BigQuery ML is used to forecast demand for your data, which is then visualized in Looker.
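To make the replication step more concrete, here is a minimal Python sketch of the general idea behind change data capture: a stream of change events is replayed, in order, onto a replica keyed by primary key. It is an illustration only, not how Datastream or the Dataflow template is implemented. The `ChangeEvent` type, its field names, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One change record. All field names here are hypothetical."""
    key: int                    # primary key of the changed row
    op: str                     # "INSERT", "UPDATE", or "DELETE"
    ts: int                     # source timestamp, used to order changes
    row: Optional[dict] = None  # new row values; None for deletes


def apply_changes(replica: Dict[int, dict], events: Iterable[ChangeEvent]) -> Dict[int, dict]:
    """Replay change events in timestamp order onto a replica keyed by primary key."""
    for event in sorted(events, key=lambda e: e.ts):
        if event.op == "DELETE":
            # Remove the row if present; ignore deletes for unknown keys.
            replica.pop(event.key, None)
        else:
            # INSERT and UPDATE are both treated as an upsert.
            replica[event.key] = dict(event.row or {})
    return replica


# Hypothetical sales rows: two inserts, one update, one delete.
events = [
    ChangeEvent(key=1, op="INSERT", ts=10, row={"product": "apples", "quantity": 5}),
    ChangeEvent(key=2, op="INSERT", ts=11, row={"product": "pears", "quantity": 3}),
    ChangeEvent(key=1, op="UPDATE", ts=12, row={"product": "apples", "quantity": 7}),
    ChangeEvent(key=2, op="DELETE", ts=13),
]
replica = apply_changes({}, events)
print(replica)  # {1: {'product': 'apples', 'quantity': 7}}
```

Sorting by timestamp before applying means the replica converges to the same state even if events arrive out of order, which is the property a real-time replication pipeline needs.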
Learning Objectives
- Replicate and process data from Oracle into BigQuery in real time.
- Run demand forecasting in BigQuery against data that has been replicated and processed from Oracle.
- Learn how to visualize forecasted demand and operational data in real time in Looker.
You’ll be using a variety of technologies to achieve this, most of them Google Cloud offerings:
- Oracle
- BigQuery
- Datastream
- Dataflow
- BigQuery ML
- Looker or Looker Studio
Challenges
- Challenge 1: Getting started
- Get yourself ready to develop our FastFresh solution.
- Challenge 2: Replicating Oracle Data Using Datastream
- Backfill the Oracle FastFresh schema and replicate updates to Cloud Storage in real time.
- Challenge 3: Creating a Dataflow Job using the Datastream to BigQuery Template
- Now it’s time to create a Dataflow job which will read from Cloud Storage and update BigQuery. You will deploy the pre-built Datastream to BigQuery Dataflow streaming template to capture these changes and replicate them into BigQuery.
- Challenge 4: Building a Demand Forecast
- In this challenge you will use BigQuery ML to build a model to forecast the demand for products in store.
- Challenge 5: Visualizing the results
- In this challenge you will use your favourite visualization tool to display the predictions from the previous challenge.
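As a rough preview of what Challenge 4 involves, the sketch below builds two BigQuery ML statements as plain strings: one that trains a time-series model and one that queries it for a forecast. It assumes the `ARIMA_PLUS` model type, which is one of several options BigQuery ML offers and is not prescribed by this gHack. The dataset, table, and column names (`retail.hourly_sales`, `sale_hour`, `product_name`, `total_sold`) are hypothetical placeholders, and the helper functions are for illustration only.

```python
def create_model_sql(model: str, source_table: str) -> str:
    """Build a CREATE MODEL statement for a per-product time-series model.

    Column names are hypothetical; adjust them to your own schema.
    """
    return f"""
CREATE OR REPLACE MODEL `{model}`
OPTIONS(
  model_type = 'ARIMA_PLUS',
  time_series_timestamp_col = 'sale_hour',
  time_series_data_col = 'total_sold',
  time_series_id_col = 'product_name'
) AS
SELECT sale_hour, product_name, total_sold
FROM `{source_table}`
""".strip()


def forecast_sql(model: str, horizon: int, confidence_level: float = 0.9) -> str:
    """Build an ML.FORECAST query for the next `horizon` time points."""
    return (
        f"SELECT * FROM ML.FORECAST(MODEL `{model}`, "
        f"STRUCT({horizon} AS horizon, {confidence_level} AS confidence_level))"
    )


# Hypothetical model and table names.
print(create_model_sql("retail.demand_model", "retail.hourly_sales"))
print(forecast_sql("retail.demand_model", horizon=24))
```

The generated statements can be pasted into the BigQuery console or passed to a client library. Keeping them as strings here avoids assuming any particular client setup.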
Prerequisites
- Access to a GCP project with the Owner IAM role
- Basic understanding of GCP
Contributors
- Murat Eken
- Carlos Augusto
- Gino Filicetti