Challenge 2: Replicating Oracle Data Using Datastream


Introduction

Datastream supports the synchronization of data to Google Cloud databases and storage solutions from sources such as MySQL and Oracle.

In this challenge, we’ll configure Datastream to load the Oracle FastFresh schema and replicate updates from the Oracle database to Cloud Storage in real time.

Note: Datastream can also stream data directly from Oracle to BigQuery, with transformations done inside BigQuery. However, so that the next challenges can include Dataflow jobs, we’ll stage the data in Cloud Storage in this challenge, then transform it with Dataflow before loading it into BigQuery in the next one.

Description

Configure Datastream to replicate data from the ORDERS table in the Oracle database into the bucket created in the previous challenge, in JSON format, connecting as the datastream user. Validate the stream, but don’t start it yet. Make sure the stream also includes existing records, not just new changes.

Note: The Oracle source and Cloud Storage destination prerequisites were already fulfilled during setup, so you can skip that section. You can also ignore the potential validation error for Cloud Storage permissions at the end.
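
The challenge can be completed entirely in the console, but it may help to see roughly what the resulting stream looks like as an API resource. The following is a minimal sketch, not the reference solution: it only builds and prints a request body shaped like the Datastream `Stream` REST resource, and it does not call any API. The project, region, connection profile names, display name, and destination path are hypothetical placeholders, and the schema name `FASTFRESH` is an assumption based on the schema mentioned in the introduction.

```python
import json

# Hypothetical placeholders: replace with the values from your own environment.
PROJECT = "my-project"
LOCATION = "us-central1"
SOURCE_PROFILE = "oracle-src"  # Oracle connection profile (datastream user)
DEST_PROFILE = "gcs-dest"      # Cloud Storage connection profile (holds the bucket)


def profile_path(name: str) -> str:
    """Return the full resource name of a connection profile."""
    return f"projects/{PROJECT}/locations/{LOCATION}/connectionProfiles/{name}"


def build_stream_body(schema: str, table: str, gcs_path: str) -> dict:
    """Build a Stream-shaped body for Oracle -> Cloud Storage in JSON format."""
    return {
        "displayName": "oracle-orders-to-gcs",  # hypothetical name
        "sourceConfig": {
            "sourceConnectionProfile": profile_path(SOURCE_PROFILE),
            "oracleSourceConfig": {
                # Only replicate the one table the challenge asks for.
                "includeObjects": {
                    "oracleSchemas": [
                        {"schema": schema, "oracleTables": [{"table": table}]}
                    ]
                }
            },
        },
        "destinationConfig": {
            "destinationConnectionProfile": profile_path(DEST_PROFILE),
            "gcsDestinationConfig": {
                "path": gcs_path,      # prefix inside the bucket
                "jsonFileFormat": {},  # JSON output, default options
            },
        },
        # Backfill everything selected above, so existing records are included.
        "backfillAll": {},
    }


# Schema name is an assumption; the path is a hypothetical prefix.
body = build_stream_body("FASTFRESH", "ORDERS", "/data")
print(json.dumps(body, indent=2))
```

Note that the body deliberately sets no `state` field: as far as we understand the API, a stream stays not started until it is explicitly set to running, which matches the "validate it, but don’t start it yet" requirement.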

Success Criteria

  1. You’ve created a new Datastream stream
  2. The stream is set up to replicate the ORDERS table into the bucket in JSON format

Tips

  • The IP allowlisting option is the easiest connectivity method, but don’t forget to update the firewall rules :)

Learning Resources
