Disneyland Agentic Data Cloud

[Figure: Disneyland Agentic Data Cloud 10-Challenge Architecture]

Introduction

Welcome, Disney Data Wizards! 🪄

Planning the perfect Disneyland trip is a complex optimization problem. Visitors want to maximize magic and minimize waiting. They want to know: Which rides are best suited for them? When are the crowds thinnest? What is the optimal route through the park to avoid bottleneck queues?

In this gHack, your mission is to transform raw data—visitor reviews, attraction catalogs, historical wait times, park brochures, and visitor movement logs—into an end-to-end, intelligent guest assistance system.

This gHack is designed to be highly challenging. Its 10 challenges can be worked on in parallel by 3 key team personas to speed up development:

  • DB & Platform Engineers will build the operational database in AlloyDB, configure Datastream replication, set up the database agentic layer, sync analytical insights, and assemble the final agent and web UI (Challenges 1, 2, 8, 9, and 10).
  • Data Scientists & Analysts will train predictive models, perform sentiment analysis, cluster attractions in BigQuery, and design the semantic layer/Conversational Analytics agent for park managers (Challenges 3, 6, and 7).
  • AI & Graph Engineers will construct RAG pipelines, classify multimodal images, and model/query visitor movement patterns using property graphs in BigQuery (Challenges 4 and 5).
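
If your team adjusts this split, it can help to confirm that every challenge still has exactly one owner. The following is a minimal sketch that only encodes the persona-to-challenge mapping listed above. The names `ASSIGNMENTS` and `coverage_gaps` are hypothetical and are not part of the gHack materials.

```python
# Persona-to-challenge assignment, copied from the list above.
# ASSIGNMENTS and coverage_gaps are illustrative names, not part of the gHack.
ASSIGNMENTS = {
    "DB & Platform Engineers": [1, 2, 8, 9, 10],
    "Data Scientists & Analysts": [3, 6, 7],
    "AI & Graph Engineers": [4, 5],
}


def coverage_gaps(assignments, total=10):
    """Return (missing, duplicated) challenge numbers for a proposed split."""
    owners_by_challenge = {}
    for persona, challenges in assignments.items():
        for challenge in challenges:
            owners_by_challenge.setdefault(challenge, []).append(persona)
    # Challenges nobody owns.
    missing = [c for c in range(1, total + 1) if c not in owners_by_challenge]
    # Challenges claimed by more than one persona.
    duplicated = sorted(
        c for c, owners in owners_by_challenge.items() if len(owners) > 1
    )
    return missing, duplicated


if __name__ == "__main__":
    missing, duplicated = coverage_gaps(ASSIGNMENTS)
    print("missing:", missing, "duplicated:", duplicated)
```

Both lists come back empty for the split above. Editing `ASSIGNMENTS` for a smaller or larger team and rerunning the check shows any challenge left unowned or claimed twice.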

Get ready to build an agentic data pipeline that would make Mickey proud! Let the magic begin! ✨


Learning objectives

In this hack, you will build an end-to-end data pipeline with AI and database capabilities on Google Cloud:

  1. AlloyDB AI: Ingest operational data and generate vector embeddings natively.
  2. Real-time replication: Set up CDC from AlloyDB to BigQuery using Datastream.
  3. Agentic database: Configure predictable SQL generation in AlloyDB and expose it as tools.
  4. Predictive analytics: Train forecasting and sentiment models using BigQuery ML.
  5. Multimodal RAG: Build image classification and PDF search pipelines.
  6. Graph analysis: Map visitor movements and query patterns using GQL.
  7. Context layer (Knowledge Catalog): Build a centralized metadata context layer, business glossary, and profiling rules.
  8. Conversational analytics: Build a natural language assistant using BigQuery Studio.
  9. AlloyDB sync: Copy insights back to AlloyDB using FDW for fast serving.
  10. MCP tools: Expose database capabilities using the MCP Toolbox.
  11. App deployment: Build a web app with the ADK and deploy it locally.

Challenges


Prerequisites

  • Basic knowledge of Google Cloud services (AlloyDB, BigQuery, Datastream)
  • Intermediate knowledge of SQL and PostgreSQL
  • Basic familiarity with Python and agentic AI concepts (MCP, ADK)
  • Access to a Google Cloud project with the necessary APIs and resources provisioned

Contributors

  • Matt Cornillon
  • Rayhane Rezgui

👥 Team Roles & Parallelization Paths

This gHack is designed to be highly challenging, but it is fully parallelizable across team members. Whether you are a team of 2, 5, or more, you can split the challenges, build in parallel, and assemble a complete intelligent system in under 4 hours.

Here is how you can divide and conquer based on your team size:

[Figure: 3-Group Parallelization Timeline]

👥 Option 2: The 2-Group Split