Challenge 5: Graph analytics and visitor flow
Previous Challenge Next Challenge
| Target Persona: Graph Specialist / Data Analyst | Estimated Duration: 45 minutes |
Introduction
Understanding visitor movement patterns is key to optimizing park operations and recommending optimal paths to avoid long queues. In this challenge, you will define a Property Graph in BigQuery using BigQuery’s new native SQL Graph capabilities over the replicated visitor movement logs, query the graph for flow patterns and multi-hop journeys, and construct a routing recommendation table.
Description
Task 5.1: Build a Property Graph in BigQuery
Using BigQuery’s new native SQL Graph capabilities, you will define a property graph over the attractions and movements.
Important Dependency Note: Creating the Property Graph requires both
public_disneyland_attractionsandpublic_visitor_movementstables to exist in BigQuery. This requires Challenge 1 (Datastream replication) to be committed first.
- Define the Property Graph schema.
- Nodes (Vertices): Attractions (from the replicated
public_disneyland_attractionstable). - Edges (Relationships): Movements (from the replicated
public_visitor_movementstable).
- Nodes (Vertices): Attractions (from the replicated
- Write the DDL to create the property graph
disney_movement_graph.
Task 5.2: Query the Graph for Patterns
Write graph queries using GRAPH_TABLE and GQL match patterns to solve the following analytical questions:
- Flow Analysis: What are the top 3 attractions visitors run to immediately after leaving “Space Mountain”?
Write a query matching paths:
(a:Attraction {name: 'Space Mountain'}) -[e:Moved]-> (b:Attraction). - Multi-Hop Journeys: Find the most common 3-ride sequences (A -> B -> C) starting from “Space Mountain” taken by the same visitor within a 2-hour window.
Analyze paths matching:
(a:Attraction {name: 'Space Mountain'}) -[e1:Moved]-> (b:Attraction) -[e2:Moved]-> (c:Attraction)where both movements are made by the same visitor.
Task 5.3: Visitor Journey Tracking & Path Analysis
Now, let’s explore GQL’s path capabilities to analyze journeys taken by visitors.
- Specific Journey Tracking: Write a graph query using quantified path patterns to find how many rides visitor
'11613'took to get from'Dumbo the Flying Elephant'to'Disneyland Railroad'. - Multi-Hop Journeys by Visitor: Write a graph query to find all unique visitor IDs who traveled from
'Space Mountain'to'Indiana Jones Adventure'through an intermediate attraction in a 2-hop journey (A -> B -> C) where both transitions are made by the same visitor.
Task 5.4: Graph-Based Recommendations
To power our intelligent guest assistant, we need to provide next-ride recommendations based on real visitor behavior.
- Extract Recommendations: Write a graph query to find the most recurrent next attraction visitors go to after visiting each specific attraction.
- Build the Recommendation Table: Save the results of this query into a new BigQuery table named
graph_recommendations. This table should include the current attraction, the recommended next attraction, and a ranking score (e.g., based on frequency). This table will be synced and used later by the agent.
Success Criteria
To validate this challenge, you must demonstrate the following:
- Show the SQL DDL statement used to define and create the Property Graph
disney_movement_graph. - Provide the SQL graph query for the “Flow Analysis” (top 3 rides after Space Mountain) and its corresponding output.
- Provide the SQL graph query for “Multi-Hop Journeys” starting from Space Mountain and its corresponding output.
- Provide the SQL graph queries for “Visitor Journey Tracking” and “Reachable Journeys”, and display their outputs.
- Verify the creation and content of the
graph_recommendationstable showing the most recurrent next attractions.