Challenge 2: Formula E-mbed


Introduction

Embeddings are high-dimensional numerical vectors that represent entities such as text, video, or audio, allowing machine learning models to encode their semantics. These vectors let us measure distances between items and find semantically similar ones. If we want to search within our videos and find the most relevant one for a given question, generating embeddings is the first step.

Description

Now that the source data is available in BigQuery, use BigQuery ML capabilities to generate multimodal embeddings and store them in a new BigQuery table. Make sure there is only one embedding vector per 2-minute segment.
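A minimal sketch of how this could look in BigQuery SQL, assuming a dataset named video_dataset, a BigQuery connection named us.vertex-ai-connection, and an object table video_dataset.videos over the sample video files from the previous challenge; all of these names are placeholders, not part of the challenge:

```sql
-- Remote model wrapping Vertex AI's multimodal embedding endpoint
-- (dataset and connection names are placeholders).
CREATE OR REPLACE MODEL `video_dataset.multimodal_embedding_model`
  REMOTE WITH CONNECTION `us.vertex-ai-connection`
  OPTIONS (ENDPOINT = 'multimodalembedding@001');

-- Generate one embedding per 2-minute video segment and persist the results.
CREATE OR REPLACE TABLE `video_dataset.video_embeddings` AS
SELECT *
FROM ML.GENERATE_EMBEDDING(
  MODEL `video_dataset.multimodal_embedding_model`,
  TABLE `video_dataset.videos`,     -- object table over the sample videos
  STRUCT(
    TRUE AS flatten_json_output,    -- return the vector as an array of floats
    120  AS interval_seconds        -- 120 s = one embedding per 2-minute segment
  )
);
```

With flatten_json_output set to TRUE, the embedding vector should land in the ml_generate_embedding_result column as an array of floats, alongside columns identifying which part of the video each segment covers.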

Success Criteria

  • There is a new BigQuery table with 14 rows of multimodal embeddings for the sample video files (a quick way to verify the row count is sketched below).
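A quick self-check, using the placeholder table name from the sketch above:

```sql
-- Expect 14 rows: one embedding vector per 2-minute segment across the sample videos.
SELECT COUNT(*) AS embedding_rows
FROM `video_dataset.video_embeddings`;
```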

Learning Resources

Tips

  • The method for creating multimodal embeddings supports a few different arguments; pay particular attention to flatten_json_output and interval_seconds (see the sketch below).
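One way to see the effect of flatten_json_output, assuming the placeholder table produced in the Description sketch: when it is set to TRUE, the vector should come back as an array of floats rather than a raw JSON response, so you can inspect it directly.

```sql
-- Peek at the embedding produced for each video segment
-- (table and column names assume the placeholder sketch above
-- and flatten_json_output = TRUE).
SELECT
  uri,                                                     -- source video from the object table
  ARRAY_LENGTH(ml_generate_embedding_result) AS embedding_dimensions
FROM `video_dataset.video_embeddings`
LIMIT 5;
```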
