Pre-Summer Special Limited Time 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: validbest

Pass the Databricks Databricks-Certified-Data-Engineer-Associate Questions and answers with ValidTests

Exam Databricks-Certified-Data-Engineer-Associate All Questions
Exam Databricks-Certified-Data-Engineer-Associate Premium Access

View all detail and faqs for the Databricks-Certified-Data-Engineer-Associate exam

Viewing page 3 out of 6 pages
Viewing questions 21-30 out of questions
Questions # 21:

Which method should a Data Engineer apply to ensure Workflows are being triggered on schedule?

Options:

A.

Scheduled Workflows require an always-running cluster, which is more expensive but reduces processing latency.

B.

Scheduled Workflows process data as it arrives at configured sources.

C.

Scheduled Workflows can reduce resource consumption and expense since the cluster runs only long enough to execute the pipeline.

D.

Scheduled Workflows run continuously until manually stopped.

Expert Solution
Questions # 22:

A data organization leader is upset about the data analysis team’s reports being different from the data engineering team’s reports. The leader believes the siloed nature of their organization’s data engineering and data analysis architectures is to blame.

Which of the following describes how a data lakehouse could alleviate this issue?

Options:

A.

Both teams would autoscale their work as data size evolves

B.

Both teams would use the same source of truth for their work

C.

Both teams would reorganize to report to the same department

D.

Both teams would be able to collaborate on projects in real-time

E.

Both teams would respond more quickly to ad-hoc requests

Expert Solution
Questions # 23:

A data engineering team has two tables. The first table march_transactions is a collection of all retail transactions in the month of March. The second table april_transactions is a collection of all retail transactions in the month of April. There are no duplicate records between the tables.

Which of the following commands should be run to create a new table all_transactions that contains all records from march_transactions and april_transactions without duplicate records?

Options:

A.

CREATE TABLE all_transactions AS

SELECT * FROM march_transactions

INNER JOIN SELECT * FROM april_transactions;

B.

CREATE TABLE all_transactions AS

SELECT * FROM march_transactions

UNION SELECT * FROM april_transactions;

C.

CREATE TABLE all_transactions AS

SELECT * FROM march_transactions

OUTER JOIN SELECT * FROM april_transactions;

D.

CREATE TABLE all_transactions AS

SELECT * FROM march_transactions

INTERSECT SELECT * from april_transactions;

E.

CREATE TABLE all_transactions AS

SELECT * FROM march_transactions

MERGE SELECT * FROM april_transactions;

Expert Solution
Questions # 24:

Which of the following data lakehouse features results in improved data quality over a traditional data lake?

Options:

A.

A data lakehouse provides storage solutions for structured and unstructured data.

B.

A data lakehouse supports ACID-compliant transactions.

C.

A data lakehouse allows the use of SQL queries to examine data.

D.

A data lakehouse stores data in open formats.

E.

A data lakehouse enables machine learning and artificial Intelligence workloads.

Expert Solution
Questions # 25:

A new data engineering team team. has been assigned to an ELT project. The new data engineering team will need full privileges on the database customers to fully manage the project.

Which of the following commands can be used to grant full permissions on the database to the new data engineering team?

Options:

A.

GRANT USAGE ON DATABASE customers TO team;

B.

GRANT ALL PRIVILEGES ON DATABASE team TO customers;

C.

GRANT SELECT PRIVILEGES ON DATABASE customers TO teams;

D.

GRANT SELECT CREATE MODIFY USAGE PRIVILEGES ON DATABASE customers TO team;

E.

GRANT ALL PRIVILEGES ON DATABASE customers TO team;

Expert Solution
Questions # 26:

A dataset has been defined using Delta Live Tables and includes an expectations clause:

CONSTRAINT valid_timestamp EXPECT (timestamp > '2020-01-01') ON VIOLATION DROP ROW

What is the expected behavior when a batch of data containing data that violates these constraints is processed?

Options:

A.

Records that violate the expectation are dropped from the target dataset and loaded into a quarantine table.

B.

Records that violate the expectation are added to the target dataset and flagged as invalid in a field added to the target dataset.

C.

Records that violate the expectation are dropped from the target dataset and recorded as invalid in the event log.

D.

Records that violate the expectation are added to the target dataset and recorded as invalid in the event log.

E.

Records that violate the expectation cause the job to fail.

Expert Solution
Questions # 27:

A data engineer has configured a Structured Streaming job to read from a table, manipulate the data, and then perform a streaming write into a new table.

The cade block used by the data engineer is below:

Question # 27

If the data engineer only wants the query to execute a micro-batch to process data every 5 seconds, which of the following lines of code should the data engineer use to fill in the blank?

Options:

A.

trigger("5 seconds")

B.

trigger()

C.

trigger(once="5 seconds")

D.

trigger(processingTime="5 seconds")

E.

trigger(continuous="5 seconds")

Expert Solution
Questions # 28:

Which of the following benefits is provided by the array functions from Spark SQL?

Options:

A.

An ability to work with data in a variety of types at once

B.

An ability to work with data within certain partitions and windows

C.

An ability to work with time-related data in specified intervals

D.

An ability to work with complex, nested data ingested from JSON files

E.

An ability to work with an array of tables for procedural automation

Expert Solution
Questions # 29:

A data engineer needs access to a table new_table, but they do not have the correct permissions. They can ask the table owner for permission, but they do not know who the table owner is.

Which of the following approaches can be used to identify the owner of new_table?

Options:

A.

Review the Permissions tab in the table's page in Data Explorer

B.

All of these options can be used to identify the owner of the table

C.

Review the Owner field in the table's page in Data Explorer

D.

Review the Owner field in the table's page in the cloud storage solution

E.

There is no way to identify the owner of the table

Expert Solution
Questions # 30:

Which SQL keyword can be used to convert a table from a long format to a wide format?

Options:

A.

TRANSFORM

B.

PIVOT

C.

SUM

D.

CONVERT

Expert Solution
Viewing page 3 out of 6 pages
Viewing questions 21-30 out of questions