Challenge 6: AI with multi-modal analysis
Introduction
In a modern Lakehouse, data isn’t just rows and columns; it’s also unstructured data like images, PDFs, and audio. Google Cloud enables “multi-modality” by allowing BigQuery to manage and analyze unstructured data through Object Tables. By using BigQuery AI and pre-trained models (like Gemini or specialized vision models), we can extract insights from images—such as identifying defects in returned items—and join those insights directly with our Iceberg tables.
Description
A batch of images representing returned products has been uploaded to your Google Cloud Storage bucket in the /return_images/ folder. Each filename corresponds to a product ID. Your goal is to analyze these images, extract a description of each faulty product, and correlate that information with our product catalog.
- Create an Object Table: Create an external Object Table in BigQuery that points to the images in your Cloud Storage bucket. You can use the same BigLake connection created in Challenge 1.
- Join product and image data: Create a new table that combines the structured product data (backed by Iceberg) with the image metadata from the Object Table, including a column of the ObjectRef data type that references each image.
- Perform Multi-modal Analysis: Write a SQL query that uses the AI.GENERATE function to describe the condition of the products in the return images.
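The three steps above might be sketched roughly as follows. This is an illustrative outline, not a complete solution: the dataset name, bucket name, connection ID, products table, model endpoint, and the `.jpg` filename pattern are all assumptions you will need to adapt to your own environment.

```sql
-- 1. Object Table over the return images.
--    Reuses the BigLake connection from Challenge 1 (name is a placeholder).
CREATE OR REPLACE EXTERNAL TABLE `my_dataset.return_images`
WITH CONNECTION `us.my_biglake_connection`
OPTIONS (
  object_metadata = 'SIMPLE',
  uris = ['gs://my-bucket/return_images/*']
);

-- 2. Combine product rows with an ObjectRef pointing at each image.
--    OBJ.MAKE_REF builds an ObjectRef from a GCS URI and a connection.
--    The join condition assumes filenames like <product_id>.jpg.
CREATE OR REPLACE TABLE `my_dataset.product_returns` AS
SELECT
  p.product_id,
  p.product_name,
  OBJ.MAKE_REF(o.uri, 'us.my_biglake_connection') AS image_ref
FROM `my_dataset.products` AS p
JOIN `my_dataset.return_images` AS o
  ON o.uri LIKE CONCAT('%/', p.product_id, '.jpg');

-- 3. Describe the condition of each returned product with AI.GENERATE,
--    passing the ObjectRef alongside the text prompt.
SELECT
  product_id,
  AI.GENERATE(
    ('Describe the condition of the returned product in this image.',
     image_ref),
    connection_id => 'us.my_biglake_connection',
    endpoint => 'gemini-2.5-flash'
  ).result AS condition_description
FROM `my_dataset.product_returns`;
```

AI.GENERATE returns a STRUCT, so the query selects its `result` field to get the plain-text description required by the success criteria.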
Success Criteria
- An Object Table is successfully created and lists the URIs of the return images.
- A single table exists combining product and image metadata using the ObjectRef data type.
- A SQL query successfully processes the images and returns a text description of the condition of the returned item.