Challenge 4: Fine-grained access control

Introduction

In a modern Lakehouse, flexibility must be balanced with rigorous security. As we consolidate data into Apache Iceberg, protecting Personally Identifiable Information (PII) becomes a top priority.

Using BigLake and Data Catalog, you can enforce “Zero Trust” security directly on your Iceberg tables. By applying Data Masking and Column-Level Security, you ensure that sensitive fields like email addresses or pricing data are obscured by default. This is a foundational step for Responsible AI, preventing sensitive data from leaking into LLM prompts or being used improperly during model training.

Description

Create a new taxonomy called “Sensitive Data” in the us-central1 region with two policy tags, each with its own data policy:

  • Confidential: using the default masking rule.
  • Email: using the email masking rule.
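
To build intuition for what the two masking rules return, here is a rough local model in Python. It is a sketch only and not how the service implements masking. It assumes that the default rule returns a type-appropriate default (an empty string, zero, false) and that the email rule replaces the username with XXXXX. The simplified validity regex and the hex-encoded SHA-256 fallback for invalid addresses are assumptions made for illustration.

```python
import hashlib
import re
from decimal import Decimal

# Simplified email check (assumption); the real service applies its own rules.
_EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def default_mask(value):
    """Model of the default masking rule: return a default for the value's type."""
    if isinstance(value, bool):  # check bool first, since bool is a subclass of int
        return False
    if isinstance(value, int):
        return 0
    if isinstance(value, float):
        return 0.0
    if isinstance(value, Decimal):
        return Decimal(0)
    if isinstance(value, str):
        return ""
    return None  # other types are not modelled in this sketch


def email_mask(value):
    """Model of the email masking rule: hide the username, keep the domain."""
    if _EMAIL.match(value):
        _, domain = value.rsplit("@", 1)
        return "XXXXX@" + domain
    # Fallback for invalid addresses: hash the value (hex encoding is an assumption).
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


print(default_mask(49.99))              # 0.0
print(email_mask("alice@example.com"))  # XXXXX@example.com
```

Under this model, a masked query against products.retail_price would return zeros, while users.email would keep its domain part, which can still be useful for aggregate analysis.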

Apply the newly created data policies to the tables:

  • Attach the Confidential data policy to products.retail_price.
  • Attach the Email data policy to users.email.
  • Run a SQL query to select these columns. Are you able to read the data?
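
Attaching a policy tag to a column amounts to setting `policyTags.names` on that field in the table's JSON schema. The sketch below shows that shape in plain Python. The project ID, taxonomy ID, policy tag ID, and the `id` column are hypothetical placeholders, so substitute the values from your own environment. How you then apply the updated schema (console, CLI, or API) is up to you.

```python
import copy
import json


def attach_policy_tag(schema, column, policy_tag):
    """Return a copy of `schema` with `policy_tag` set on the named column."""
    updated = copy.deepcopy(schema)  # leave the caller's schema untouched
    for field in updated:
        if field["name"] == column:
            field["policyTags"] = {"names": [policy_tag]}
            return updated
    raise KeyError(f"column {column!r} not found in schema")


# Hypothetical resource name; use the IDs of your own taxonomy and policy tag.
EMAIL_TAG = "projects/my-project/locations/us-central1/taxonomies/123/policyTags/456"

# Hypothetical two-column excerpt of the users table schema.
users_schema = [
    {"name": "id", "type": "INTEGER"},
    {"name": "email", "type": "STRING"},
]

tagged = attach_policy_tag(users_schema, "email", EMAIL_TAG)
print(json.dumps(tagged, indent=2))
```

Working on a copy keeps the original schema available for comparison, which helps when checking that only the intended column changed.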

Assign the appropriate permissions to read the columns.

  • Grant your user the Masked Reader role (BigQuery Data Policy) and check whether you can see the masked values.
  • Grant your user the Fine-Grained Reader role (Data Catalog) and check whether you can see the original values.
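
The effect of the two roles can be summarised as a small decision function. This is a simplified mental model in Python, not the service's actual authorization logic. It assumes that Fine-Grained Reader takes precedence when a user holds both roles, and that a user with neither role is denied access to the protected column. The function name `visible_value` is hypothetical.

```python
# IAM role IDs for the two reader roles.
MASKED_READER = "roles/bigquerydatapolicy.maskedReader"
FINE_GRAINED_READER = "roles/datacatalog.categoryFineGrainedReader"


def visible_value(roles, original, masked):
    """Return what a user holding `roles` would see for a protected column."""
    if FINE_GRAINED_READER in roles:
        return original  # full access to the underlying value
    if MASKED_READER in roles:
        return masked    # only the output of the masking rule
    raise PermissionError("Access denied: no reader role for this column")


print(visible_value({MASKED_READER}, "alice@example.com", "XXXXX@example.com"))
print(visible_value({FINE_GRAINED_READER}, "alice@example.com", "XXXXX@example.com"))
```

Comparing your query results against this model at each step is a quick way to confirm that each role has taken effect.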

Success Criteria

  • Column-level security is applied correctly on the products and users tables.
  • You are able to read the masked and unmasked data when using the appropriate roles.

Learning Resources
