New Year Sale Limited Time 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: simple70

Databricks Certified Associate Developer for Apache Spark 3.5 – Python Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Question # 18 Topic 2 Discussion

Databricks Certified Associate Developer for Apache Spark 3.5 – Python Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Question # 18 Topic 2 Discussion

Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Exam Topic 2 Question 18 Discussion:
Question #: 18
Topic #: 2

Given the schema:

Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Question 18

event_ts TIMESTAMP,

sensor_id STRING,

metric_value LONG,

ingest_ts TIMESTAMP,

source_file_path STRING

The goal is to deduplicate based on: event_ts, sensor_id, and metric_value.

Options:


A.

dropDuplicates on all columns (wrong criteria)


B.

dropDuplicates with no arguments (removes based on all columns)


C.

groupBy without aggregation (invalid use)


D.

dropDuplicates on the exact matching fields


Get Premium Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Questions

Contribute your Thoughts:


Chosen Answer:
This is a voting comment (?). It is better to Upvote an existing comment if you don't have anything to add.