Challenge 5: Graph analytics and visitor flow

Previous Challenge Next Challenge

Target Persona: Graph Specialist / Data Analyst Estimated Duration: 45 minutes

Introduction

Understanding visitor movement patterns is key to optimizing park operations and recommending optimal paths to avoid long queues. In this challenge, you will define a Property Graph in BigQuery using BigQuery’s new native SQL Graph capabilities over the replicated visitor movement logs, query the graph for flow patterns and multi-hop journeys, and construct a routing recommendation table.

Description

Task 5.1: Build a Property Graph in BigQuery

Using BigQuery’s new native SQL Graph capabilities, you will define a property graph over the attractions and movements.

 Important Dependency Note: Creating the Property Graph requires both public_disneyland_attractions and public_visitor_movements tables to exist in BigQuery. This requires Challenge 1 (Datastream replication) to be committed first.

  1. Define the Property Graph schema.
    • Nodes (Vertices): Attractions (from the replicated public_disneyland_attractions table).
    • Edges (Relationships): Movements (from the replicated public_visitor_movements table).
  2. Write the DDL to create the property graph disney_movement_graph.

Task 5.2: Query the Graph for Patterns

Write graph queries using GRAPH_TABLE and GQL match patterns to solve the following analytical questions:

  1. Flow Analysis: What are the top 3 attractions visitors run to immediately after leaving “Space Mountain”? Write a query matching paths: (a:Attraction {name: 'Space Mountain'}) -[e:Moved]-> (b:Attraction).
  2. Multi-Hop Journeys: Find the most common 3-ride sequences (A -> B -> C) starting from “Space Mountain” taken by the same visitor within a 2-hour window. Analyze paths matching: (a:Attraction {name: 'Space Mountain'}) -[e1:Moved]-> (b:Attraction) -[e2:Moved]-> (c:Attraction) where both movements are made by the same visitor.

Task 5.3: Visitor Journey Tracking & Path Analysis

Now, let’s explore GQL’s path capabilities to analyze journeys taken by visitors.

  1. Specific Journey Tracking: Write a graph query using quantified path patterns to find how many rides visitor '11613' took to get from 'Dumbo the Flying Elephant' to 'Disneyland Railroad'.
  2. Multi-Hop Journeys by Visitor: Write a graph query to find all unique visitor IDs who traveled from 'Space Mountain' to 'Indiana Jones Adventure' through an intermediate attraction in a 2-hop journey (A -> B -> C) where both transitions are made by the same visitor.

Task 5.4: Graph-Based Recommendations

To power our intelligent guest assistant, we need to provide next-ride recommendations based on real visitor behavior.

  1. Extract Recommendations: Write a graph query to find the most recurrent next attraction visitors go to after visiting each specific attraction.
  2. Build the Recommendation Table: Save the results of this query into a new BigQuery table named graph_recommendations. This table should include the current attraction, the recommended next attraction, and a ranking score (e.g., based on frequency). This table will be synced and used later by the agent.

Success Criteria

To validate this challenge, you must demonstrate the following:

  • Show the SQL DDL statement used to define and create the Property Graph disney_movement_graph.
  • Provide the SQL graph query for the “Flow Analysis” (top 3 rides after Space Mountain) and its corresponding output.
  • Provide the SQL graph query for “Multi-Hop Journeys” starting from Space Mountain and its corresponding output.
  • Provide the SQL graph queries for “Visitor Journey Tracking” and “Reachable Journeys”, and display their outputs.
  • Verify the creation and content of the graph_recommendations table showing the most recurrent next attractions.

Previous Challenge Next Challenge