From Preparation to Success: “A Roadmap for Passing the Databricks Certified Data Engineer Associate Exam”

Hello, Medium family! I am back with an article sharing my journey from preparation to success in the certification exam.

I am happy to share that I cleared the Databricks Certified Data Engineer Associate exam after one month of preparation. In this article I share the resources I used for preparation, a week-by-week plan, and the types of questions you can expect.

Week 1 preparation:

Resource 1 ==> Udemy course by Derar Alhussein

Preparation course for Databricks Data Engineer Associate certification exam (Versions 2 and 3) with hands-on training

Note: During week 1, also go through the exam syllabus, the weightage of each section, the total number of questions, the time limit, and the exam format, so that you have all the required information about the exam.

About the Exam

● Number of items: 45 multiple-choice questions

● Time limit: 90 minutes

● Registration fee: USD 200, plus applicable taxes as required per local law

● Delivery method: Online Proctored
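As a rough pacing aid, the numbers above work out to about two minutes per question. The small Python sketch below does that arithmetic; the 10-minute review buffer is an illustrative assumption, not an exam rule.

```python
# Pacing sketch based on the exam facts listed above.
TOTAL_QUESTIONS = 45      # from the exam description
TIME_LIMIT_MIN = 90       # from the exam description
REVIEW_BUFFER_MIN = 10    # assumption: time kept aside for a final review pass


def minutes_per_question(total_questions, time_limit_min, buffer_min=0):
    """Return the average minutes available per question after reserving a buffer."""
    if total_questions <= 0:
        raise ValueError("total_questions must be positive")
    if buffer_min < 0 or buffer_min >= time_limit_min:
        raise ValueError("buffer_min must be non-negative and smaller than the time limit")
    return (time_limit_min - buffer_min) / total_questions


if __name__ == "__main__":
    no_buffer = minutes_per_question(TOTAL_QUESTIONS, TIME_LIMIT_MIN)
    with_buffer = minutes_per_question(TOTAL_QUESTIONS, TIME_LIMIT_MIN, REVIEW_BUFFER_MIN)
    print(f"No buffer: {no_buffer:.2f} min per question")
    print(f"With buffer: {with_buffer:.2f} min per question")
```

Timing your practice tests against the same budget makes it easier to notice early if you are spending too long on individual questions.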

Week 2 preparation:

Resource 1 ==> Course by Databricks platform

Certification Overview: Databricks Certified Data Engineer Associate V2 Exam.

Resource 2 ==> Course by Databricks platform

Databricks Lakehouse Fundamentals Learning Plan — V2

Note: Resource 1 includes questions that have previously appeared in the exam.

Week 3 preparation:

Resource 1 ==> Udemy practice test by Derar Alhussein

Resource 2 ==> Udemy practice test by Akhil R

Practice Exams Databricks Certified Data Engineer Associate both for V2 and V3 exams, 5 Exams with detailed explanations

Week 4 preparation:

(i) Review your practice tests, re-learn the concepts behind the questions you answered incorrectly, and re-attempt the practice tests until you consistently get a passing score.

(ii) In the last week, just review the notes you created while learning.
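One simple way to act on step (i) is to log every missed question with its topic and see which topics come up most often. The Python sketch below is only an illustration: the question numbers and topic labels are hypothetical placeholders, not the official syllabus sections.

```python
from collections import Counter

# Hypothetical log of missed practice-test questions: (question number, topic).
# The topic labels are illustrative placeholders, not official exam sections.
missed_questions = [
    (4, "Delta Lake"),
    (11, "Structured Streaming"),
    (17, "Delta Lake"),
    (23, "Permissions"),
    (30, "Delta Lake"),
]


def weakest_topics(missed, top_n=3):
    """Count missed questions per topic and return the most-missed topics first."""
    counts = Counter(topic for _, topic in missed)
    return counts.most_common(top_n)


def score_percent(total_questions, missed_count):
    """Return the practice-test score as a percentage."""
    if total_questions <= 0 or not 0 <= missed_count <= total_questions:
        raise ValueError("invalid question counts")
    return 100.0 * (total_questions - missed_count) / total_questions


if __name__ == "__main__":
    for topic, count in weakest_topics(missed_questions):
        print(f"{topic}: {count} missed")
    print(f"Score: {score_percent(45, len(missed_questions)):.1f}%")
```

Re-running this after each attempt shows whether the weak topics are shrinking or just moving around.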

Sample Questions From Databricks

Question 1:

Describe the benefits of a data lakehouse over a traditional data warehouse. What is a benefit of a data lakehouse that is unavailable in a traditional data warehouse?

A. A data lakehouse provides a relational system of data management.

B. A data lakehouse captures snapshots of data for version control purposes.

C. A data lakehouse couples storage and compute for complete control.

D. A data lakehouse utilizes proprietary storage formats for data.

E. A data lakehouse enables both batch and streaming analytics.

Question 2:

Identify query optimization techniques. A data engineering team needs to query a Delta table to extract rows that all meet the same condition. However, the team has noticed that the query is running slowly. The team has already tuned the size of the data files. Upon investigating, the team has concluded that the rows meeting the condition are sparsely located throughout each of the data files. Which optimization techniques could speed up the query?

A. Data skipping

B. Z-Ordering

C. Bin-packing

D. Write as a Parquet file

E. Tuning the file size

Question 3:

Identify data workloads that utilize a Silver table as its source. Which data workload will utilize a Silver table as its source?

A. A job that enriches data by parsing its timestamps into a human-readable format

B. A job that queries aggregated data that already feeds into a dashboard

C. A job that ingests raw data from a streaming source into the Lakehouse

D. A job that aggregates cleaned data to create standard summary statistics

E. A job that cleans data by removing malformatted records

Question 4:

Describe how to configure a refresh schedule. An engineering manager uses a Databricks SQL query to monitor their team’s progress on fixes related to customer-reported bugs. The manager checks the results of the query every day, but they are manually rerunning the query each day and waiting for the results. How should the query be scheduled to ensure the results of the query are updated each day?

A. To refresh every 12 hours from the query’s page in Databricks SQL.

B. To refresh every 1 day from the query’s page in Databricks SQL.

C. To run every 12 hours from the Jobs UI.

D. To refresh every 1 day from the SQL warehouse page in Databricks SQL.

E. To refresh every 12 hours from the SQL warehouse page in Databricks SQL.

Question 5:

Identify commands for granting appropriate permissions. A new data engineer has started at a company. The data engineer has recently been added to the company’s Databricks workspace as The data engineer needs to be able to query the table sales in the database retail. The new data engineer already has been granted USAGE on the database retail. Which command should be used to grant the appropriate permissions to the new data engineer?







Answers: 1: E, 2: B, 3: D, 4: B, 5: C
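For Question 2, Z-Ordering is applied in Delta Lake through the `OPTIMIZE ... ZORDER BY` command, which co-locates rows with similar values in the same files so that data skipping can prune more of them. Here is a minimal sketch in Python. The table name `events` and the column `event_type` are hypothetical, and the commented-out `spark.sql` call assumes a Databricks notebook where a `spark` session already exists.

```python
import re

# Simple check for plain or dotted identifiers such as "events" or "retail.sales".
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*$")


def build_zorder_statement(table, columns):
    """Build a Delta Lake OPTIMIZE ... ZORDER BY statement for the given columns."""
    if not columns:
        raise ValueError("at least one Z-order column is required")
    for name in [table, *columns]:
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    return f"OPTIMIZE {table} ZORDER BY ({', '.join(columns)})"


# Hypothetical table and column names, for illustration only.
statement = build_zorder_statement("events", ["event_type"])
print(statement)

# In a Databricks notebook, where `spark` is predefined, this could be run with:
# spark.sql(statement)
```

The column passed to `ZORDER BY` should be the one used in the slow query's filter condition, since that is the column the query needs to skip files on.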

