Introduction to GenAI
Introduction
Introduction to GenAI will challenge you to build a system that catalogues scientific papers. Whenever a new paper is uploaded to a designated Cloud Storage bucket, a Cloud Function is triggered to run OCR on it, extract the paper’s title and summary using an LLM, and store that information in BigQuery. We’ll then run an LLM from BigQuery to classify the papers into distinct categories. Next, we’ll add semantic search capabilities in BigQuery using text embeddings, and finally implement a more scalable version of that search using Vector Search.
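The ingestion flow described above can be sketched as a single handler that chains the steps. This is only a minimal illustration, not the hack's solution: the names `PaperRecord` and `handle_upload` are hypothetical, and the OCR, LLM and storage steps are passed in as plain callables so the sketch runs without any cloud services. The event is assumed to carry the bucket and object name under the keys `bucket` and `name`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PaperRecord:
    """Hypothetical row holding what the pipeline stores per paper."""
    uri: str
    title: str
    summary: str


def handle_upload(
    event: dict,
    run_ocr: Callable[[str], str],
    extract_title: Callable[[str], str],
    summarize: Callable[[str], str],
    store: Callable[[PaperRecord], None],
) -> PaperRecord:
    """Process one uploaded paper: OCR, then LLM extraction, then storage."""
    # Assumption: the trigger event exposes the bucket and object name.
    uri = f"gs://{event['bucket']}/{event['name']}"
    text = run_ocr(uri)                      # stand-in for the OCR step
    record = PaperRecord(
        uri=uri,
        title=extract_title(text),           # stand-in for an LLM call
        summary=summarize(text),             # stand-in for an LLM call
    )
    store(record)                            # stand-in for a BigQuery insert
    return record


# Usage with in-memory stand-ins instead of real services.
rows: list[PaperRecord] = []
record = handle_upload(
    {"bucket": "papers", "name": "sample.pdf"},
    run_ocr=lambda uri: "A Sample Paper Title\nThis paper studies a sample topic.",
    extract_title=lambda text: text.splitlines()[0],
    summarize=lambda text: text.splitlines()[1],
    store=rows.append,
)
```

Passing the steps in as callables keeps the orchestration separate from the services, so each challenge can swap in a real implementation without changing the handler.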
Learning Objectives
This hack will help you explore the following tasks:
- Using Vertex AI foundation models for text understanding
- Prompt engineering
- Using BigQuery to run LLMs
- Using text embeddings for semantic search in BigQuery
- Using Vertex AI Vector Search to store and search text embeddings
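To make the idea behind embedding-based semantic search concrete, here is a small sketch in plain Python. It assumes documents and the query have already been turned into vectors; the three-dimensional vectors, the paper titles and the helper names `cosine_similarity` and `search` are made up for illustration. Real text embeddings have many more dimensions and come from an embedding model, and this brute-force comparison is not how BigQuery or Vector Search work internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec, index, top_k=2):
    """Return the titles of the top_k vectors most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), title)
        for title, vec in index.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [title for _, title in scored[:top_k]]


# Toy index: hypothetical titles mapped to made-up embedding vectors.
toy_index = {
    "Paper on protein folding": [0.9, 0.1, 0.0],
    "Paper on galaxy formation": [0.0, 0.2, 0.9],
    "Paper on enzyme kinetics": [0.8, 0.3, 0.1],
}

results = search([1.0, 0.2, 0.0], toy_index)
print(results)
```

Comparing the query against every stored vector is fine for a handful of papers but scales linearly with the collection size, which is the motivation for a dedicated vector index at larger scale.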
Challenges
- Challenge 1: Automatic triggers
- Challenge 2: First steps into the LLM realm
- Challenge 3: Summarizing a large document using chaining
- Challenge 4: BigQuery ❤ LLMs
- Challenge 5: Simple semantic search
- Challenge 6: Vector Search for scale
Prerequisites
- Basic knowledge of GCP
- Basic knowledge of Python
- Basic knowledge of SQL
- Access to a GCP environment
Contributors
- Murat Eken