Databricks Certified Data Engineer Professional Exam Databricks-Certified-Professional-Data-Engineer Question # 38 Topic 4 Discussion

Question #: 38
Topic #: 4

A data engineer is configuring a pipeline that will potentially see late-arriving, duplicate records.

In addition to de-duplicating records within the batch, which of the following approaches allows the data engineer to deduplicate data against previously processed records as it is inserted into a Delta table?


A. Set the configuration delta.deduplicate = true.
B. VACUUM the Delta table after each batch completes.
C. Perform an insert-only merge with a matching condition on a unique key.
D. Perform a full outer join on a unique key and overwrite existing data.
E. Rely on Delta Lake schema enforcement to prevent duplicate records.
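
For reference, the insert-only merge described in option C can be sketched roughly as below. This is a minimal illustration, not a definitive implementation: the helper `insert_only_merge_sql` and the names `events`, `new_events`, and `event_id` are hypothetical, and the source view is assumed to have been de-duplicated within the batch already. Because the statement has only a `WHEN NOT MATCHED` clause, rows whose key already exists in the target are left untouched and only unseen keys are inserted.

```python
def insert_only_merge_sql(target: str, source: str, key: str) -> str:
    """Build a Delta Lake MERGE statement that only inserts unmatched rows.

    Hypothetical helper for illustration; it does no identifier escaping.
    """
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON t.{key} = s.{key}\n"
        # No WHEN MATCHED clause: existing rows are never updated or deleted.
        "WHEN NOT MATCHED THEN INSERT *"
    )


# Assumed names: `events` is the target Delta table, `new_events` is a view
# over the incoming batch, and `event_id` is the unique key.
statement = insert_only_merge_sql("events", "new_events", "event_id")
print(statement)
```

On a cluster, the resulting string would typically be run with `spark.sql(statement)` after registering the batch as the `new_events` view.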

