Challenge 2: Creating the agentic database layer
Previous Challenge Next Challenge
| Target Persona: DBA / Database Developer | Estimated Duration: 40 minutes | Prerequisite: Challenge 1 |
Introduction
In this challenge, you will establish the agentic database layer directly on top of your AlloyDB instance. This involves configuring QueryData to guarantee predictable SQL generation for common queries (NL2SQL), and exposing advanced database capabilities (like hybrid search and semantic filtering) as SQL functions.
By preparing these operational capabilities now, you lay the foundation for the agent to interact securely and deterministically with your data.
Description
QueryData is an intelligent natural language-to-SQL (NL2SQL) engine in AlloyDB that translates conversational questions into precise, secure SQL queries. Rather than just matching static “golden queries”, it uses a Context Set (a JSON file defining your schema’s business logic) to dynamically construct queries using:
- Query Templates: Parameterized QA pairs (using
$1,$2) that allow the engine to generalize query patterns. - Query Facets: Reusable filters (e.g., mapping “highly rated” to
rating >= 4) that can be dynamically appended. - Value Search Queries: Background queries (exact, fuzzy, or semantic) that map user-typed entities to actual database values.
For more details, see the AlloyDB QueryData Documentation.
Task 2.1: Enable the AlloyDB Data API
QueryData and downstream API integration require the AlloyDB Data API (also known as the executesql API) to be explicitly enabled on your AlloyDB instance. Run the following curl command in your Cloud Shell to enable it:
curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://alloydb.googleapis.com/v1alpha/projects/\${GOOGLE_CLOUD_PROJECT}/locations/<YOUR_REGION>/clusters/disney-cluster/instances/disney-instance?updateMask=dataApiAccess" \
-d '{"dataApiAccess": "ENABLED"}'
(Note: Replace <YOUR_REGION> with the region where your cluster is deployed, e.g. europe-west1. The project ID is automatically resolved via the GOOGLE_CLOUD_PROJECT environment variable in Cloud Shell.)
Task 2.2: Register IAM Users in AlloyDB
To interact with the database, QueryData relies on IAM Database Authentication. This means that each user that needs to use querydata must be registered as a database user and given privileges. We will just use personal accounts for querydata.
To register your user, follow these steps:
- Log in using Basic Authentication:
- In the Google Cloud Console, navigate to AlloyDB > Clusters and select your cluster.
- Click AlloyDB Studio in the left menu.
- Sign in to the
disneydatabase using Basic Authentication (Username:postgres, Password:<your_cluster_password>).
- Register the IAM User:
- In your AlloyDB console, navigate to Users
- Click Add user account
- Select Cloud IAM
- Enter the email you use to log into the Google Cloud Console in the User ID field
- Click Save
Task 2.3: Set Up the QueryData Context Set in AlloyDB
-
Create the JSON Context Set File: Create a file named
querydata_disney_context.jsonon your local system. Fill in the SQL templates and queries according to the requirements:{ "templates": [ { "nlQuery": "Show available attractions in Disneyland Paris", "sql": "/* TODO: Write SQL to select name, description from public.disneyland_attractions filtered by branch */", "intent": "List all attractions for a specific park branch", "manifest": "List attractions by branch", "parameterized": { "parameterized_intent": "Show available attractions in $1", "parameterized_sql": "/* TODO: Write the parameterized SQL using $1 */" } }, { "nlQuery": "Find reviews with rating 5 for Space Mountain", "sql": "/* TODO: Write SQL to join reviews and attractions on branch, filtered by attraction name and rating */", "intent": "Get reviews with a specific rating for a named attraction", "manifest": "Get reviews by attraction and rating", "parameterized": { "parameterized_intent": "Find reviews with rating $2 for $1", "parameterized_sql": "/* TODO: Write the parameterized SQL using $1 (name) and $2 (rating) */" } }, { "nlQuery": "Average rating of attractions in California Adventure", "sql": "/* TODO: Write SQL to select the average rating from public.disneyland_reviews filtered by branch */", "intent": "Calculate the average review rating for a specific branch", "manifest": "Average rating by branch", "parameterized": { "parameterized_intent": "Average rating of attractions in $1", "parameterized_sql": "/* TODO: Write the parameterized SQL using $1 */" } } ], "facets": [ { "sql_snippet": "/* TODO: Write SQL snippet to filter rating >= 4 */", "intent": "highly rated reviews", "manifest": "Filter reviews by a minimum rating threshold", "parameterized": { "parameterized_intent": "reviews with rating greater than or equal to $1", "parameterized_sql_snippet": "/* TODO: Write the parameterized SQL snippet using $1 */" } } ], "value_searches": [ { "query": "SELECT DISTINCT T.name as value, 'public.disneyland_attractions.name' as columns, 'Attraction Name' as concept_type, (1.0 - ts_rank(to_tsvector('english', T.name), plainto_tsquery('english', $value))) as distance, '{}'::text as context FROM public.disneyland_attractions T WHERE to_tsvector('english', T.name) @@ plainto_tsquery('english', $value)", "concept_type": "Attraction Name", "description": "Full-text search for attraction names" }, { "query": "SELECT DISTINCT T.branch as value, 'public.disneyland_attractions.branch' as columns, 'Branch Name' as concept_type, (1.0 - ts_rank(to_tsvector('english', T.branch), plainto_tsquery('english', $value))) as distance, '{}'::text as context FROM public.disneyland_attractions T WHERE to_tsvector('english', T.branch) @@ plainto_tsquery('english', $value)", "concept_type": "Branch Name", "description": "Full-text search for park branches" } ] } -
Upload Context Set to AlloyDB (via IAM Authentication):
- In the Google Cloud Console, open AlloyDB Studio again.
- This time, sign in using IAM Authentication (this will use your logged-in Google account, which you registered in Task 2.2). You can click on the “Switch user/database” button (icon with a person and database) to switch to IAM authentication.
- In the left sidebar, check for Context sets.
- Click Create context set, name it
disney-context, and upload thequerydata_disney_context.jsonfile. - Validate the setup by using the Test context set feature with variations of your natural language templates.
Task 2.4: Expose AlloyDB AI Operators
You will prepare SQL queries that leverage AlloyDB’s advanced AI features to expose them as tools.
Tip Remember to switch back to basic authentication (logging in as the
postgresuser) in AlloyDB Studio before executing the SQL queries in this task.
-
Hybrid Search (ScaNN + FTS): Before setting up hybrid search, you need to install the required extensions in AlloyDB.
What is Hybrid Search (
ai.hybrid_search)? Hybrid search combines the strengths of semantic search (vector similarity) and keyword search (Full-Text Search) to deliver highly relevant results. While vector search excels at understanding the conceptual meaning of a query, keyword search ensures exact matches (like names or specific terms) aren’t missed. Theai.hybrid_searchfunction, provided by theaiextension, executes both searches in parallel and merges their results using Reciprocal Rank Fusion (RRF) to produce a single, unified relevance score.Run the following SQL commands in AlloyDB Studio to install the extensions and create the necessary indexes:
-- Install required extensions CREATE EXTENSION IF NOT EXISTS alloydb_scann CASCADE; CREATE EXTENSION IF NOT EXISTS rum CASCADE; -- Index RUM for Keyword FTS CREATE INDEX IF NOT EXISTS attractions_tsvector_idx ON disneyland_attractions USING RUM (description_tsvector rum_tsvector_ops); -- Index ScaNN for Vector Cosine Similarity CREATE INDEX IF NOT EXISTS attractions_vector_idx ON disneyland_attractions USING scann (embedding cosine) WITH (num_leaves=10);After creating the indexes, try running a hybrid search query using the
ai.hybrid_searchfunction to see it in action (try searching for a “thrilling space roller coaster”).This hybrid search capability will be mapped later directly to an MCP tool. Save it as a SQL file for later reference.
-
Semantic Filtering (
google_ml.if): Create a SQL function to easily evaluate if a specific attraction is suitable or safe based on a guest’s natural language profile (e.g., checking if a specific ride is safe for pregnant women or suitable for toddlers).What is Semantic Filtering (
google_ml.if)? Semantic filtering allows you to evaluate complex, natural language conditions directly within your SQL queries. Unlike vector similarity searches that rank results by closeness,google_ml.ifuses a generative AI model to evaluate a prompt against your data on a row-by-row basis, returning a simple boolean (trueorfalse). This is extremely powerful for ad-hoc safety or suitability checks where pre-defined categories or labels do not exist.Tip Before writing the reusable SQL function, it is highly recommended to build and run a standalone test query in AlloyDB Studio. This allows you to verify the behavior of
google_ml.ifon a single attraction (for example, testing if “Space Mountain” is suitable for “pregnant women”) before embedding it in the function.Create the SQL function:
CREATE OR REPLACE FUNCTION check_attraction_suitability(attraction_name TEXT, suitability_profile TEXT) RETURNS TABLE(name TEXT, description TEXT) AS $$ SELECT name, description FROM disneyland_attractions WHERE name = attraction_name AND /* TODO: Add the google_ml.if operator logic here to evaluate description against suitability_profile */; $$ LANGUAGE SQL;
Success Criteria
To validate this challenge, you must demonstrate the following:
- Show a screenshot or validation proof of the QueryData displaying a successful natural-language-to-SQL translation.
- Provide the SQL DDL definition for the
check_attraction_suitabilityfunction in AlloyDB Studio. - Provide the SQL query you used to run and test the hybrid search (ScaNN + FTS) in AlloyDB Studio, along with its results.
- Provide the SQL query showing how you called and tested the
check_attraction_suitabilityfunction in AlloyDB Studio, along with its results.