Challenge 4: Generating text and embeddings


Introduction

In this challenge we’ll create enhanced product descriptions and text embeddings for the products table in BigQuery to prepare for semantic search.

Description

Add the following two columns to the products table in BigQuery: product_description (STRING) and product_description_embeddings (ARRAY<FLOAT64>). Using an LLM from BigQuery, generate product descriptions based on the product name, brand, category, department and retail_price for at least 100 products, and store them in the new product_description column.

Note
We’re only generating the descriptions for 100 products, as doing it for the complete dataset would take too long.
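One possible shape for this step is sketched below as a small Python helper that only builds the SQL text; it is not the official solution. It assumes a remote text model already exists in BigQuery (the names `my_dataset.products` and `my_dataset.text_model` are hypothetical), that the products table has a unique `id` column, and that `ML.GENERATE_TEXT` is called with `flatten_json_output`, so the generated text arrives in `ml_generate_text_llm_result`. The two statements are returned separately so they can be run one after the other.

```python
def build_description_statements(table: str, text_model: str, limit: int = 100) -> list[str]:
    """Return two BigQuery statements: add the columns, then fill product_description.

    `table` and `text_model` are fully qualified names supplied by the caller;
    the `id` join key is an assumption about the products table.
    """
    add_columns = f"""
ALTER TABLE `{table}`
  ADD COLUMN IF NOT EXISTS product_description STRING,
  ADD COLUMN IF NOT EXISTS product_description_embeddings ARRAY<FLOAT64>
""".strip()

    generate_descriptions = f"""
UPDATE `{table}` AS p
SET product_description = g.ml_generate_text_llm_result
FROM (
  SELECT id, ml_generate_text_llm_result
  FROM ML.GENERATE_TEXT(
    MODEL `{text_model}`,
    (
      -- ML.GENERATE_TEXT reads the prompt from a column named `prompt`.
      SELECT
        id,
        CONCAT(
          'Write a short product description for: ', name,
          ', brand: ', brand,
          ', category: ', category,
          ', department: ', department,
          ', retail price: ', CAST(retail_price AS STRING)
        ) AS prompt
      FROM `{table}`
      -- CONCAT returns NULL if any argument is NULL, so skip incomplete rows.
      WHERE name IS NOT NULL AND brand IS NOT NULL
      ORDER BY id
      LIMIT {limit}
    ),
    STRUCT(TRUE AS flatten_json_output)
  )
) AS g
WHERE p.id = g.id
""".strip()

    return [add_columns, generate_descriptions]


if __name__ == "__main__":
    for statement in build_description_statements("my_dataset.products", "my_dataset.text_model"):
        print(statement, end=";\n\n")
```

The printed statements can be pasted into the BigQuery console or submitted through a client library. The `LIMIT` keeps the run to 100 products, in line with the note above.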

Then, using an embeddings model, again from BigQuery, generate embeddings for the product_description column (for the 100 product descriptions that have been generated) and store them in the new product_description_embeddings column.
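The embedding step can be sketched in the same way. The helper below only builds SQL text and rests on several assumptions: a remote embedding model already exists (the name `my_dataset.embedding_model` is hypothetical), the products table has a unique `id` column, and `ML.GENERATE_EMBEDDING` is used, which reads its input from a column named `content` and returns the vector in `ml_generate_embedding_result`. Output column names can differ for other embedding functions or versions.

```python
def build_embedding_sql(table: str, embedding_model: str) -> str:
    """Return a BigQuery UPDATE that embeds every non-empty product_description.

    `table` and `embedding_model` are fully qualified names supplied by the
    caller; the `id` join key is an assumption about the products table.
    """
    return f"""
UPDATE `{table}` AS p
SET product_description_embeddings = e.ml_generate_embedding_result
FROM (
  SELECT id, ml_generate_embedding_result
  FROM ML.GENERATE_EMBEDDING(
    MODEL `{embedding_model}`,
    (
      -- ML.GENERATE_EMBEDDING reads the text from a column named `content`.
      SELECT id, product_description AS content
      FROM `{table}`
      -- Only rows that already have a generated description are embedded.
      WHERE product_description IS NOT NULL
    ),
    STRUCT(TRUE AS flatten_json_output)
  )
) AS e
WHERE p.id = e.id
""".strip()


if __name__ == "__main__":
    print(build_embedding_sql("my_dataset.products", "my_dataset.embedding_model"))
```

Filtering on non-null descriptions restricts the call to the products that were described in the previous step, so the rest of the table is left untouched.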

Success Criteria

  • There are two new columns in the BigQuery products table: product_description and product_description_embeddings.
  • The column product_description contains the LLM-generated product descriptions for at least 100 products.
  • The column product_description_embeddings contains the embeddings for the product descriptions.
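A quick self-check against these criteria might look like the sketch below. It is an illustration rather than the official checker: the query counts rows with a description and rows whose description has no embedding, and the hypothetical helper `meets_success_criteria` applies the 100-product threshold to those two numbers. The table name `my_dataset.products` is an assumption.

```python
def build_check_sql(table: str) -> str:
    """Return a BigQuery query that summarises how many rows were filled in."""
    return f"""
SELECT
  COUNTIF(product_description IS NOT NULL) AS described,
  COUNTIF(
    product_description IS NOT NULL
    AND IFNULL(ARRAY_LENGTH(product_description_embeddings), 0) = 0
  ) AS missing_embeddings
FROM `{table}`
""".strip()


def meets_success_criteria(described: int, missing_embeddings: int, minimum: int = 100) -> bool:
    """True when enough products are described and each of them has an embedding."""
    return described >= minimum and missing_embeddings == 0


if __name__ == "__main__":
    print(build_check_sql("my_dataset.products"))
    # Example with made-up counts: 100 descriptions, none missing an embedding.
    print(meets_success_criteria(described=100, missing_embeddings=0))
```

If either new column does not exist yet, the query itself fails, which also covers the first criterion.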

Learning Resources
