Databricks Certified Associate Developer for Apache Spark 3.5-Python Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Question # 22 Topic 3 Discussion

Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Exam Topic 3 Question 22 Discussion:
Question #: 22
Topic #: 3

An engineer wants to join two DataFrames, df1 and df2, on their respective employee_id and emp_id columns:

df1: employee_id INT, name STRING

df2: emp_id INT, department STRING

The engineer uses:

result = df1.join(df2, df1.employee_id == df2.emp_id, how='inner')

What is the behaviour of the code snippet?


A. The code fails to execute because the column names employee_id and emp_id do not match automatically.

B. The code fails to execute because it must use on='employee_id' to specify the join column explicitly.

C. The code fails to execute because PySpark does not support joining DataFrames with a different structure.

D. The code works as expected because the join condition explicitly matches employee_id from df1 with emp_id from df2.

