Nidhi Gupta
3 min read · Mar 2, 2024


Data Management in Databricks: Data Object Hierarchy

The data object hierarchy in Databricks describes how data is organized and accessed in the Lakehouse platform. The Databricks Lakehouse organizes data stored with Delta Lake using familiar relations such as databases, tables, and views, combining the benefits of a data warehouse and a data lake.

The primary data objects in the Databricks Lakehouse are metastores, catalogs, databases/schemas, tables, views, functions, and volumes.
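To make the nesting concrete, here is a minimal Python sketch of the hierarchy as plain data classes. It is only an illustration, not a Databricks API: the classes and the names `demo_metastore`, `main`, `sales`, `orders`, and `top_customers` are all made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    """A database/schema; maps each object name to its kind."""
    name: str
    objects: dict = field(default_factory=dict)  # e.g. {"orders": "table"}


@dataclass
class Catalog:
    """A catalog; groups schemas."""
    name: str
    schemas: dict = field(default_factory=dict)  # schema name -> Schema


@dataclass
class Metastore:
    """Top-level container; groups catalogs."""
    name: str
    catalogs: dict = field(default_factory=dict)  # catalog name -> Catalog

    def qualified_names(self):
        """Return every object as 'catalog.schema.object', sorted."""
        names = []
        for catalog in self.catalogs.values():
            for schema in catalog.schemas.values():
                for obj in schema.objects:
                    names.append(f"{catalog.name}.{schema.name}.{obj}")
        return sorted(names)


# Hypothetical example data, invented for this sketch
sales = Schema("sales", {"orders": "table", "top_customers": "view"})
main = Catalog("main", {"sales": sales})
metastore = Metastore("demo_metastore", {"main": main})

print(metastore.qualified_names())
```

Each object ends up addressable by a three-part name, which mirrors how an object sits inside a schema, which sits inside a catalog, which sits inside a metastore.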

Metastores (HMS):

1) The metastore is the central repository that stores metadata about the data objects in the lakehouse, which makes it one of the most important components of the platform. The Hive Metastore (HMS) is the classic example of such a central repository.

2) There are two types of metastores in Azure Databricks: the Unity Catalog metastore and the built-in Hive metastore.

Unity Catalog metastore vs built-in Hive metastore:

The built-in Hive metastore supports only a single catalog, called hive_metastore, whereas a Unity Catalog metastore can contain multiple catalogs.
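As a rough sketch of what the single-catalog rule means in practice, the Python snippet below builds three-part table names and rejects any other catalog for the built-in Hive metastore. The helper functions, the `"hive"` and `"unity"` labels, and the `default.events` table are assumptions invented for this illustration; none of them are Databricks functions.

```python
HIVE_CATALOG = "hive_metastore"  # the single catalog of the built-in Hive metastore


def table_path(schema, table, catalog=HIVE_CATALOG):
    """Build a three-part table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def validate_catalog(metastore_kind, catalog):
    """Return the catalog name, or raise if this kind of metastore cannot hold it.

    metastore_kind is a label made up for this sketch: "hive" or "unity".
    """
    if metastore_kind == "hive" and catalog != HIVE_CATALOG:
        raise ValueError(
            f"the built-in Hive metastore only has {HIVE_CATALOG!r}, got {catalog!r}"
        )
    return catalog


# Hypothetical schema and table names
print(table_path("default", "events"))
```

With the default catalog, the helper produces a name of the form `hive_metastore.<schema>.<table>`, which is the usual way to address a table that lives in the built-in Hive metastore.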

Catalogs:

  1. Catalogs exist as objects within a metastore.
  2. Every database will be associated…


Nidhi Gupta

Azure Data Engineer 👨‍💻. Heading towards cloud technologies expertise ✌️.