Pass the Databricks Databricks-Certified-Data-Engineer-Associate Questions and answers with ValidTests

Questions # 41:

A data engineer wants to create a relational object by pulling data from two tables. The relational object does not need to be used by other data engineers in other sessions. In order to save on storage costs, the data engineer wants to avoid copying and storing physical data.

Which of the following relational objects should the data engineer create?

Options:

Spark SQL Table

View

Database

Temporary view

Delta Table
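For reference, a temporary view stores only a query definition and is visible only within the Spark session that created it. A minimal sketch, assuming a `SparkSession` is passed in; the function name, the tables `orders` and `customers`, their columns, and the view name are hypothetical:

```python
def create_session_view(spark, view_name="orders_with_customers"):
    """Register a session-scoped view over a join of two tables.

    No data is copied: only the query definition is registered, and the
    view disappears when the Spark session ends.
    """
    # Hypothetical source tables and columns, used only for illustration.
    joined = spark.sql(
        """
        SELECT o.order_id, o.amount, c.customer_name
        FROM orders AS o
        JOIN customers AS c
          ON o.customer_id = c.customer_id
        """
    )
    joined.createOrReplaceTempView(view_name)
    return view_name
```

The view can then be queried by name in the same session, for example with `spark.table("orders_with_customers")`.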

Questions # 42:

A data engineer is maintaining a data pipeline. Upon data ingestion, the data engineer notices that the source data is starting to have a lower level of quality. The data engineer would like to automate the process of monitoring the quality level.

Which of the following tools can the data engineer use to solve this problem?

Options:

Unity Catalog

Data Explorer

Delta Lake

Delta Live Tables

Auto Loader

Questions # 43:

A data engineer has been using a Databricks SQL dashboard to monitor the cleanliness of the input data to an ELT job. The ELT job has an associated Databricks SQL query that returns the number of input records containing unexpected NULL values. The data engineer wants their entire team to be notified via a messaging webhook whenever this value reaches 100.

Which of the following approaches can the data engineer use to notify their entire team via a messaging webhook whenever the number of NULL values reaches 100?

Options:

They can set up an Alert with a custom template.

They can set up an Alert with a new email alert destination.

They can set up an Alert with a new webhook alert destination.

They can set up an Alert with one-time notifications.

They can set up an Alert without notifications.

Questions # 44:

What is stored in a Databricks customer's cloud account?

Options:

Data

Cluster management metadata

Databricks web application

Notebooks

Questions # 45:

A data engineer has configured a Structured Streaming job to read from a table, manipulate the data, and then perform a streaming write into a new table.

The code block used by the data engineer is below:

[Image: code block for Question # 45, not reproduced]

Which line of code should the data engineer use to fill in the blank if the data engineer only wants the query to execute a micro-batch to process data every 5 seconds?

Options:

trigger("5 seconds")

trigger(continuous="5 seconds")

trigger(once="5 seconds")

trigger(processingTime="5 seconds")
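For context, the `trigger` method of PySpark's `DataStreamWriter` takes keyword arguments, and `processingTime` sets a fixed interval between micro-batches. A minimal sketch, assuming `stream_df` is a streaming DataFrame; the function name, target table, and checkpoint path are hypothetical:

```python
def start_fixed_interval_write(stream_df, target_table, checkpoint_path):
    """Start a streaming write that runs one micro-batch every 5 seconds."""
    return (
        stream_df.writeStream
        .trigger(processingTime="5 seconds")  # fixed-interval micro-batches
        .option("checkpointLocation", checkpoint_path)
        .outputMode("append")
        .toTable(target_table)  # starts the query and returns a StreamingQuery
    )
```

A call might look like `start_fixed_interval_write(df, "new_table", "/tmp/checkpoints/new_table")`, where both names are placeholders.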

Questions # 46:

A data engineer is working with two tables. Each of these tables is displayed below in its entirety.

[Image: the two tables for Question # 46, not reproduced]

The data engineer runs the following query to join these tables together:

[Image: the join query for Question # 46, not reproduced]

Which of the following will be returned by the above query?

[Image: result tables for Options A to E of Question # 46, not reproduced]

Options:

Option A

Option B

Option C

Option D

Option E

Questions # 47:

A data engineer is designing a data pipeline. The source system generates files in a shared directory that is also used by other processes. As a result, the files should be kept as is and will accumulate in the directory. The data engineer needs to identify which files are new since the previous run in the pipeline, and set up the pipeline to only ingest those new files with each run.

Which of the following tools can the data engineer use to solve this problem?

Options:

Unity Catalog

Delta Lake

Databricks SQL

Data Explorer

Auto Loader
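For reference, Auto Loader is exposed in Databricks as the `cloudFiles` streaming source, and it records which files have already been ingested in the stream's checkpoint, so source files can stay in place. A minimal sketch, assuming a Databricks `SparkSession`; the function name, directories, and default file format are hypothetical:

```python
def read_new_files(spark, source_dir, schema_dir, file_format="json"):
    """Return a streaming DataFrame that picks up only files not yet processed."""
    return (
        spark.readStream
        .format("cloudFiles")  # Auto Loader source on Databricks
        .option("cloudFiles.format", file_format)
        .option("cloudFiles.schemaLocation", schema_dir)  # where the inferred schema is tracked
        .load(source_dir)
    )
```

The returned DataFrame still needs a streaming write with a `checkpointLocation` option for the file-tracking state to persist between runs.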

Questions # 48:

Which of the following is hosted completely in the control plane of the classic Databricks architecture?

Options:

Worker node

JDBC data source

Databricks web application

Databricks Filesystem

Driver node

Explanation

The Databricks web application is the user interface that allows you to create and manage workspaces, clusters, notebooks, jobs, and other resources. It is hosted completely in the control plane of the classic Databricks architecture, which includes the backend services that Databricks manages in your Databricks account. The other options are part of the compute plane, which is where your data is processed by compute resources such as clusters. The compute plane is in your own cloud account and network. References: Databricks architecture overview, Security and Trust Center

QUESTION NO: 4

Which of the following benefits of using the Databricks Lakehouse Platform is provided by Delta Lake?

A. The ability to manipulate the same data using a variety of languages

B. The ability to collaborate in real time on a single notebook

C. The ability to set up alerts for query failures

D. The ability to support batch and streaming workloads

E. The ability to distribute complex data operations

Answer: D

Delta Lake is the optimized storage layer that provides the foundation for storing data and tables in the Databricks lakehouse. Delta Lake is fully compatible with Apache Spark APIs, and was developed for tight integration with Structured Streaming, allowing you to easily use a single copy of data for both batch and streaming operations and providing incremental processing at scale [1]. Delta Lake supports upserts using the merge operation, which enables you to efficiently update existing data or insert new data into your Delta tables [2]. Delta Lake also provides time travel capabilities, which allow you to query previous versions of your data or roll back to a specific point in time [3].

References:
[1] What is Delta Lake? | Databricks on AWS
[2] Upsert into a table using merge | Databricks on AWS
[3] Query an older snapshot of a table (time travel) | Databricks on AWS
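The merge and time travel features mentioned above can be sketched through Spark SQL. This is an illustration only, assuming a `SparkSession` with Delta Lake available; the function name, the table names `target_table` and `updates`, and the join key `id` are hypothetical:

```python
def upsert_and_read_previous(spark, target="target_table", source="updates", version=0):
    """Upsert rows from `source` into `target`, then query an older snapshot."""
    # Update rows whose key already exists in the target, insert the rest.
    spark.sql(
        f"""
        MERGE INTO {target} AS t
        USING {source} AS s
          ON t.id = s.id
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED THEN INSERT *
        """
    )
    # Time travel: read the table as it was at the given version number.
    return spark.sql(f"SELECT * FROM {target} VERSION AS OF {version}")
```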

Questions # 49:

Which of the following is stored in the Databricks customer's cloud account?

Options:

Databricks web application

Cluster management metadata

Repos

Data

Notebooks

Questions # 50:

Identify the impact of ON VIOLATION DROP ROW and ON VIOLATION FAIL UPDATE for a constraint violation.

A data engineer has created an ETL pipeline using Delta Live Tables to manage their company's travel reimbursement details. They want to ensure that if the location details have not been provided by the employee, the pipeline is terminated.

How can the scenario be implemented?

Options:

CONSTRAINT valid_location EXPECT (location = NULL)

CONSTRAINT valid_location EXPECT (location != NULL) ON VIOLATION FAIL UPDATE

CONSTRAINT valid_location EXPECT (location != NULL) ON DROP ROW

CONSTRAINT valid_location EXPECT (location != NULL) ON VIOLATION FAIL
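For background, Delta Live Tables expectations have three behaviors on violation: by default invalid records are retained and the violation is recorded, `ON VIOLATION DROP ROW` discards them, and `ON VIOLATION FAIL UPDATE` stops the update. The following is a plain-Python model of those semantics, not the DLT API; the function name and sample rows are hypothetical. In SQL, a null check is normally written `location IS NOT NULL`, because comparing with `NULL` using `=` or `!=` evaluates to NULL.

```python
def apply_expectation(rows, is_valid, on_violation=None):
    """Model the three expectation policies on a list of dict rows.

    on_violation=None          -> keep invalid rows (violations are only counted)
    on_violation="DROP ROW"    -> discard invalid rows
    on_violation="FAIL UPDATE" -> raise, stopping the update
    Returns (kept_rows, violation_count).
    """
    if on_violation not in (None, "DROP ROW", "FAIL UPDATE"):
        raise ValueError(f"unknown policy: {on_violation!r}")
    invalid = [r for r in rows if not is_valid(r)]
    if invalid and on_violation == "FAIL UPDATE":
        raise RuntimeError(f"{len(invalid)} row(s) violated the expectation")
    if on_violation == "DROP ROW":
        kept = [r for r in rows if is_valid(r)]
    else:
        kept = list(rows)
    return kept, len(invalid)


# Hypothetical reimbursement rows; the second one has no location.
rows = [
    {"employee": "a", "location": "Oslo"},
    {"employee": "b", "location": None},
]


def has_location(row):
    return row["location"] is not None
```

With these rows, the default policy keeps both records, `"DROP ROW"` keeps only the first, and `"FAIL UPDATE"` raises before anything is returned.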
