Introduction to GenAI

Introduction

Introduction to GenAI will challenge you to build a system that catalogues scientific papers. Whenever a new paper is uploaded to a specific Cloud Storage bucket, a Cloud Function is triggered to run OCR on it, extract the paper's title and summary using an LLM, and store that information in BigQuery. We'll then run an LLM from BigQuery to classify the papers into distinct categories, add semantic search capabilities in BigQuery using text embeddings, and finally implement a more scalable version of that search using Vector Search.

Architecture of the system
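
To make the flow described above concrete, here is a minimal Python sketch of the per-upload steps (OCR, then LLM extraction, then storage). It is not the hack's solution: `PaperRecord`, `process_upload`, the `fake_*` helpers, and the bucket URI are all hypothetical names, and each cloud service is replaced by a plain function so the example runs locally.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PaperRecord:
    """One catalogue row; field names are illustrative, not the hack's schema."""
    uri: str
    text: str
    title: str
    summary: str


def process_upload(
    uri: str,
    ocr: Callable[[str], str],
    extract: Callable[[str], Dict[str, str]],
    store: Callable[[PaperRecord], None],
) -> PaperRecord:
    """Run the per-upload steps in order: OCR, LLM extraction, storage."""
    text = ocr(uri)          # stands in for the OCR step
    fields = extract(text)   # stands in for the LLM call
    record = PaperRecord(
        uri=uri, text=text, title=fields["title"], summary=fields["summary"]
    )
    store(record)            # stands in for the BigQuery insert
    return record


# Toy stand-ins (assumptions) so the sketch runs without any cloud services.
def fake_ocr(uri: str) -> str:
    return "A Study of Widgets\nWe measure widgets."


def fake_extract(text: str) -> Dict[str, str]:
    # Treat the first line as the title and the remainder as the summary.
    title, _, rest = text.partition("\n")
    return {"title": title, "summary": rest.strip()}


table: List[PaperRecord] = []  # an in-memory list plays the role of the table
record = process_upload(
    "gs://example-bucket/paper.pdf", fake_ocr, fake_extract, table.append
)
print(record.title)
```

Passing the three steps in as functions keeps the orchestration separate from the services behind it, so each stand-in could later be swapped for a real client without changing `process_upload`.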

Learning Objectives

This hack will help you explore the following tasks:

  • Using Vertex AI foundation models for text understanding
  • Prompt engineering
  • Using BigQuery to run LLMs
  • Using text embeddings for semantic search in BigQuery
  • Using Vertex AI Vector Search to store and search text embeddings
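
The idea behind the last two objectives can be illustrated with a small, self-contained Python sketch: each paper is represented by an embedding vector, and a query is answered by ranking papers by cosine similarity to the query vector. The three-dimensional vectors and the names `cosine_similarity`, `search`, and `toy_index` are made up for illustration; real text embeddings come from an embedding model and have far more dimensions.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    query: List[float], index: Dict[str, List[float]], top_k: int = 2
) -> List[Tuple[str, float]]:
    """Brute-force search: score every entry, return the top_k most similar."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings; the values are invented for this example.
toy_index = {
    "paper_a": [1.0, 0.0, 0.0],
    "paper_b": [0.0, 1.0, 0.0],
    "paper_c": [0.9, 0.1, 0.0],
}

results = search([1.0, 0.0, 0.0], toy_index)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

This sketch compares the query against every stored vector, which is fine for a handful of papers but grows linearly with the catalogue. A dedicated vector index is meant to avoid that full scan, which is the motivation for the more scalable version built later in the hack.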

Challenges

Prerequisites

  • Basic knowledge of GCP
  • Basic knowledge of Python
  • Basic knowledge of SQL
  • Access to a GCP environment

Contributors

  • Murat Eken