Databricks Certified Associate Developer for Apache Spark 3.5 – Python Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Question # 22 Topic 3 Discussion

Databricks Certified Associate Developer for Apache Spark 3.5 – Python Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Question # 22 Topic 3 Discussion

Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Exam Topic 3 Question 22 Discussion:
Question #: 22
Topic #: 3

A data engineer observes that an upstream streaming source sends duplicate records, where duplicates share the same key and have at most a 30-minute difference in event_timestamp. The engineer adds:

dropDuplicatesWithinWatermark("event_timestamp", "30 minutes")

What is the result?


A.

It is not able to handle deduplication in this scenario


B.

It removes duplicates that arrive within the 30-minute window specified by the watermark


C.

It removes all duplicates regardless of when they arrive


D.

It accepts watermarks in seconds and the code results in an error


Get Premium Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Questions

Contribute your Thoughts:


Chosen Answer:
This is a voting comment (?). It is better to Upvote an existing comment if you don't have anything to add.