Access Controls with Unity Catalog

Databricks
16 Jan 2024 07:28

Summary

TLDR: In this video, Pearl uu from Databricks demonstrates Unity Catalog's central governance and auditing capabilities for data access control. The demo shows how to organize data assets, set permissions at different levels, and use a shared cluster that supports fine-grained access control. It also covers creating a feature store table, training and registering a model with MLflow, and setting up row- and column-level security for sensitive data so that data analysts can safely access only the relevant subsets.

Takeaways

  • Unity Catalog is a central governance tool that manages and audits data access across workspaces.
  • Data governance leaders and central governance teams have full admin capabilities to grant access to workspaces or metastores.
  • Access can be granted to users, groups, or service principals within an organization for specific data assets.
  • The demonstration uses a Databricks ML Workshop notebook to showcase the power of access controls within Unity Catalog.
  • Catalogs are the first layer of Unity Catalog's three-level namespace, used for organizing data assets and setting permissions.
  • Schemas, the second layer, organize tables, views, volumes, and models, and permissions can be set at the schema level for the teams that use them.
  • Volumes sit in the third layer, under a schema, and contain directories and files for data stored in any format, providing non-tabular data access.
  • The demo grants read and write privileges on a volume to a data science team for a specific dataset.
  • The video shows setting up a cluster with Unity Catalog for data analysis and machine learning workflows.
  • The data science team creates a feature store table for training and testing models, reusing the catalog and schema set up earlier.
  • Unity Catalog provides row- and column-level security through row filters and column masks to protect sensitive data.
  • SQL functions are used to create row filters and column masks, tailoring data access to different teams' needs.

Q & A

  • What is the role of Pearl uu in the video?

    -Pearl uu is a technical marketing engineer at Databricks, and she presents a demonstration on how Unity Catalog governs and audits data access.

  • What capabilities are granted to the data governance leader and central governance team in Unity Catalog?

    -The data governance leader and central governance team are granted full admin capabilities, allowing them to grant access to workspaces or metastores to users, groups, or service principals within the organization.

  • What is a catalog in Unity Catalog and how is it used?

    -A catalog in Unity Catalog is the first layer of the three-level namespace. It is used to organize data assets and set permissions for teams to use within the catalog.

  • What is a schema in Unity Catalog and what is its purpose?

    -A schema in Unity Catalog is the second layer of the three-level namespace. It organizes tables, views, volumes, and models, and allows for permissioning to be set at the schema level.

  • What is a volume in Unity Catalog and how does it relate to data storage?

    -A volume in Unity Catalog is part of the third layer of the namespace. It resides under a schema and contains directories and files for data stored in any format, providing non-tabular access to data.

  • What is the significance of the 'NP volume' in the demonstration?

    -The 'NP volume' is significant as it contains a dataset called 'Lending Club' that the data science team will use. It demonstrates how to grant read and write privileges on a volume.

  • What is the purpose of creating a shared cluster in Unity Catalog?

    -A shared cluster suits most use cases: multiple users can share its resources, and it supports fine-grained access control, which makes it a good fit for collaborative work.

  • What is a feature store table and how is it utilized in the script?

    -A feature store table, such as 'loan features Test 2' in the script, is used to house features needed for model training and testing. It is created to manage and organize the features for machine learning models.

  • How does Unity Catalog facilitate the registration of models?

    -Unity Catalog allows the registration of models by referencing the same catalog and schema names used throughout the process. This ensures consistency and organization of models within the catalog.

  • What is the Catalog Explorer and how does it help in managing tables and models?

    -The Catalog Explorer is a tool within Unity Catalog that allows users to view and manage all the tables and models within a specific catalog and schema. It helps in organizing and providing access to data assets.

  • How does Unity Catalog provide row and column-level security?

    -Unity Catalog provides row and column-level security through the use of row filters and column masks. This allows for the control of access to specific rows of data and the masking of sensitive columns.
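To make the row filter and column mask mechanism more concrete, the following Python sketch builds the kind of Databricks SQL statements involved: a SQL function that decides which rows are visible, a SQL function that decides what value a column shows, and the `ALTER TABLE` statements that attach them. It is an illustration under stated assumptions, not the code from the video: the table, column, function, and group names are hypothetical, as are the two helper functions. `is_account_group_member` is the Databricks SQL built-in for checking group membership.

```python
# All names below are hypothetical placeholders, not taken from the video.
TABLE = "ml_workshop.lending.loans"
ADMIN_GROUP = "governance-admins"


def row_filter_statements(table: str, function_name: str, column: str,
                          column_type: str, allowed_value: str,
                          admin_group: str) -> list[str]:
    """Return [create function, attach filter] for a simple row filter.

    Admin group members see every row; everyone else sees only rows
    where `column` equals `allowed_value`.
    """
    create = (
        f"CREATE OR REPLACE FUNCTION {function_name}({column} {column_type}) "
        f"RETURN IF(is_account_group_member('{admin_group}'), true, "
        f"{column} = '{allowed_value}')"
    )
    attach = f"ALTER TABLE {table} SET ROW FILTER {function_name} ON ({column})"
    return [create, attach]


def column_mask_statements(table: str, function_name: str, column: str,
                           column_type: str, masked_literal: str,
                           admin_group: str) -> list[str]:
    """Return [create function, attach mask] for a simple column mask.

    Admin group members see the real value; everyone else sees
    `masked_literal` (a SQL literal such as 'REDACTED').
    """
    create = (
        f"CREATE OR REPLACE FUNCTION {function_name}({column} {column_type}) "
        f"RETURN CASE WHEN is_account_group_member('{admin_group}') "
        f"THEN {column} ELSE {masked_literal} END"
    )
    attach = f"ALTER TABLE {table} ALTER COLUMN {column} SET MASK {function_name}"
    return [create, attach]


row_filter = row_filter_statements(
    TABLE, "ml_workshop.lending.state_filter", "addr_state", "STRING",
    "CA", ADMIN_GROUP)
column_mask = column_mask_statements(
    TABLE, "ml_workshop.lending.title_mask", "emp_title", "STRING",
    "'REDACTED'", ADMIN_GROUP)

for statement in row_filter + column_mask:
    print(statement)
```

Because both controls are attached to the table itself, analysts keep querying the same table name and see only the rows and values their group membership allows.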

Related Tags
Data Governance, Unity Catalog, Access Control, Technical Marketing, Data Science, ML Workshop, Data Security, Catalog Schema, Feature Store, Model Training, Data Analysis