Amazon Web Services AWS Certified Data Engineer - Associate (DEA-C01) Data-Engineer-Associate Question # 53 Topic 6 Discussion

Question #: 53

Topic #: 6

A company receives a data file from a partner each day in an Amazon S3 bucket. The company uses a daily AWS Glue extract, transform, and load (ETL) pipeline to clean and transform each data file. The output of the ETL pipeline is written to a CSV file named Daily.csv in a second S3 bucket.
Occasionally, the daily data file is empty or is missing values for required fields. When the file is missing data, the company can use the previous day's CSV file.
A data engineer needs to ensure that the previous day's data file is overwritten only if the new daily file is complete and valid.
Which solution will meet these requirements with the LEAST effort?

A. Invoke an AWS Lambda function to check the file for missing data and to fill in missing values in required fields.

B. Configure the AWS Glue ETL pipeline to use AWS Glue Data Quality rules. Develop rules in Data Quality Definition Language (DQDL) to check for missing values in required fields and empty files.

C. Use AWS Glue Studio to change the code in the ETL pipeline to fill in any missing values in the required fields with the most common values for each field.

D. Run a SQL query in Amazon Athena to read the CSV file and drop missing rows. Copy the corrected CSV file to the second S3 bucket.

Explanation

Problem Analysis:

The company runs a daily AWS Glue ETL pipeline to clean and transform files received in an S3 bucket.

If a file is incomplete or empty, the previous day’s file should be retained.

The solution must validate each new file before it overwrites the existing file.

Key Considerations:

Automate data validation with minimal human intervention.

Use built-in AWS Glue capabilities for ease of integration.

Ensure robust validation for missing or incomplete data.

Solution Analysis:

Option A: Lambda Function for Validation

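Lambda can validate files, but it would require custom code to write and maintain. This option also fills in missing values instead of keeping the previous day's file.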

Does not leverage AWS Glue’s built-in features, adding operational complexity.

Option B: AWS Glue Data Quality Rules

AWS Glue Data Quality allows defining Data Quality Definition Language (DQDL) rules.

Rules can validate if required fields are missing or if the file is empty.

Runs as a step inside the existing Glue ETL job, so no separate service or custom validation code is needed.

If validation fails, retain the previous day’s file.
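As a rough illustration of what such rules could look like, the sketch below assembles a DQDL ruleset string in Python. RowCount and IsComplete are DQDL rule types; the helper name build_ruleset and the column names are hypothetical and chosen only for this example. How empty CSV strings are treated (null or not) depends on how the job reads the file, so treat this as a starting point rather than a finished ruleset.

```python
def build_ruleset(required_fields):
    """Return a DQDL ruleset string that rejects empty files and
    null values in the given required columns."""
    rules = ["RowCount > 0"]  # fails when the file has no rows
    # IsComplete passes only when the column contains no null values
    rules += [f'IsComplete "{name}"' for name in required_fields]
    return "Rules = [\n    " + ",\n    ".join(rules) + "\n]"


# Hypothetical required columns, for illustration only
ruleset = build_ruleset(["order_id", "order_date"])
print(ruleset)
```

The resulting string is what would be supplied as the ruleset when the data quality check is added to the Glue job.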

Option C: AWS Glue Studio with Filling Missing Values

Modifying ETL code to fill missing values with most common values risks introducing inaccuracies.

Does not handle empty files effectively.

Option D: Athena Query for Validation

Athena can drop rows with missing values, but this is a post-hoc solution.

Requires manual intervention to copy the corrected file to S3, increasing complexity.

Final Recommendation:

Use AWS Glue Data Quality to define validation rules in DQDL for identifying missing or incomplete data.

This solution integrates seamlessly with the ETL pipeline and minimizes manual effort.

Implementation Steps:

Enable AWS Glue Data Quality in the existing ETL pipeline.

Define DQDL Rules, such as:

Check if a file is empty (for example, a RowCount rule).

Verify required fields are present and non-null (for example, ColumnExists and IsComplete rules).

Configure the pipeline to proceed with overwriting only if the file passes validation.

In case of failure, retain the previous day’s file.
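The gating behavior in these steps can be sketched in plain Python, independent of Glue. This is not the Glue Data Quality API; it only illustrates the decision "overwrite only when the new file is complete and valid". The function names, the dict standing in for the output bucket, and the column names are all assumptions made for the example.

```python
import csv
import io


def is_valid(csv_text, required_fields):
    """True only if the file has at least one data row and every
    required field is present and non-empty in every row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return False  # empty file, or header with no data rows
    return all(
        (row.get(field) or "").strip()
        for row in rows
        for field in required_fields
    )


def publish(store, key, csv_text, required_fields):
    """Overwrite store[key] only when the new file validates."""
    if is_valid(csv_text, required_fields):
        store[key] = csv_text
        return True
    return False  # the previous day's file stays in place


# A dict stands in for the output S3 bucket; all names are hypothetical.
bucket = {"output.csv": "id,amount\n1,10\n"}
required = ["id", "amount"]
publish(bucket, "output.csv", "id,amount\n2,\n", required)    # rejected: amount missing
publish(bucket, "output.csv", "", required)                   # rejected: empty file
publish(bucket, "output.csv", "id,amount\n3,30\n", required)  # accepted
```

In a real Glue job the validity decision would come from the data quality evaluation result rather than a hand-written check, and the write to the second bucket would be skipped when that result is a failure.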

Actual exam question for Amazon Web Services Data-Engineer-Associate exam by Riven44614 at Nov 14, 2025, 3:51:20 PM
