Amazon Web Services AWS Certified Data Engineer - Associate (DEA-C01) Data-Engineer-Associate Question # 41 Topic 5 Discussion

Data-Engineer-Associate Exam Topic 5 Question 41 Discussion:
Question #: 41
Topic #: 5

A company has a data lake in Amazon S3. The company uses AWS Glue to catalog data and AWS Glue Studio to implement data extract, transform, and load (ETL) pipelines.

The company needs to ensure that data quality issues are checked every time the pipelines run. A data engineer must enhance the existing pipelines to evaluate data quality rules based on predefined thresholds.

Which solution will meet these requirements with the LEAST implementation effort?


A.

Add a new transform that is defined by a SQL query to each Glue ETL job. Use the SQL query to implement a ruleset that includes the data quality rules that need to be evaluated.


B.

Add a new Evaluate Data Quality transform to each Glue ETL job. Use Data Quality Definition Language (DQDL) to implement a ruleset that includes the data quality rules that need to be evaluated.


C.

Add a new custom transform to each Glue ETL job. Use the PyDeequ library to implement a ruleset that includes the data quality rules that need to be evaluated.


D.

Add a new custom transform to each Glue ETL job. Use the Great Expectations library to implement a ruleset that includes the data quality rules that need to be evaluated.
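
For reference, option B mentions a ruleset written in Data Quality Definition Language (DQDL). Below is a minimal sketch of what such a ruleset with threshold-based rules might look like, assembled as a plain Python string. The column names (`order_id`, `email`, `price`), the thresholds, and the helper `build_dqdl_ruleset` are hypothetical and chosen only for illustration; they do not come from the question.

```python
def build_dqdl_ruleset(rules):
    """Join individual DQDL rule strings into one ruleset definition."""
    body = ",\n    ".join(rules)
    return f"Rules = [\n    {body}\n]"


# Hypothetical columns and thresholds, for illustration only.
rules = [
    'RowCount > 0',                 # the dataset must not be empty
    'IsComplete "order_id"',        # no NULL values allowed in order_id
    'Completeness "email" > 0.95',  # more than 95% of email values are non-NULL
    'ColumnValues "price" >= 0',    # every price is non-negative
]

ruleset = build_dqdl_ruleset(rules)
print(ruleset)
```

In a Glue Studio job, text of this shape would be supplied as the ruleset of the Evaluate Data Quality transform; the sketch only builds the string and does not call any AWS API.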

