Summer Sale Limited Time 65% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: pass65

Exam Data-Engineer-Associate All Questions
Exam Data-Engineer-Associate All Questions

View all questions & answers for the Data-Engineer-Associate exam

Amazon Web Services AWS Certified Data Engineer Data-Engineer-Associate Question # 17 Topic 1 Discussion

Data-Engineer-Associate Exam Topic 1 Question 17 Discussion:
Question #: 17
Topic #: 1

A company receives a data file from a partner each day in an Amazon S3 bucket. The company uses a daily AW5 Glue extract, transform, and load (ETL) pipeline to clean and transform each data file. The output of the ETL pipeline is written to a CSV file named Dairy.csv in a second 53 bucket.

Occasionally, the daily data file is empty or is missing values for required fields. When the file is missing data, the company can use the previous day's CSV file.

A data engineer needs to ensure that the previous day's data file is overwritten only if the new daily file is complete and valid.

Which solution will meet these requirements with the LEAST effort?


A.

Invoke an AWS Lambda function to check the file for missing data and to fill in missing values in required fields.


B.

Configure the AWS Glue ETL pipeline to use AWS Glue Data Quality rules. Develop rules in Data Quality Definition Language (DQDL) to check for missing values in required files and empty files.


C.

Use AWS Glue Studio to change the code in the ETL pipeline to fill in any missing values in the required fields with the most common values for each field.


D.

Run a SQL query in Amazon Athena to read the CSV file and drop missing rows. Copy the corrected CSV file to the second S3 bucket.


Get Premium Data-Engineer-Associate Questions

Contribute your Thoughts:


Chosen Answer:
This is a voting comment (?). It is better to Upvote an existing comment if you don't have anything to add.