A Spark application is experiencing performance issues in client mode because the driver is resource-constrained.
How should this issue be resolved?
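In client mode the driver runs on the machine that launched the application, so a resource-constrained client host throttles the whole job. One common resolution is to resubmit in cluster mode, where the driver runs on a cluster node, and to raise the driver's resources. A minimal spark-submit sketch; the application file and the memory/core values are illustrative assumptions:
spark-submit \
  --deploy-mode cluster \
  --driver-memory 8g \
  --driver-cores 4 \
  my_app.py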
An engineer wants to join two DataFrames df1 and df2 on their respective employee_id and emp_id columns:
df1: employee_id INT, name STRING
df2: emp_id INT, department STRING
The engineer uses:
result = df1.join(df2, df1.employee_id == df2.emp_id, how='inner')
What is the behaviour of the code snippet?
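For reference, a minimal runnable sketch of this scenario (the sample rows are illustrative assumptions). Because the join keys have different names, the inner join keeps both employee_id and emp_id in the output, and rows without a match on the key are dropped:
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df1 = spark.createDataFrame([(1, "Alice"), (2, "Bob")], ["employee_id", "name"])
df2 = spark.createDataFrame([(1, "Sales")], ["emp_id", "department"])

result = df1.join(df2, df1.employee_id == df2.emp_id, how="inner")
result.show()  # one row: employee_id=1, name=Alice, emp_id=1, department=Sales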
A developer runs:
What is the result?
Which Spark configuration controls the number of tasks that can run in parallel on each executor?
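The setting in question is spark.executor.cores; each task consumes spark.task.cpus cores (default 1), so the number of tasks an executor can run in parallel is spark.executor.cores divided by spark.task.cpus. A minimal sketch with illustrative values; note these settings normally take effect at application launch, so in practice they are often passed via spark-submit --conf instead:
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .config("spark.executor.cores", "4")  # 4 cores per executor
    .config("spark.task.cpus", "1")       # 1 core per task -> up to 4 parallel tasks per executor
    .getOrCreate()
)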
A data engineer needs to write a DataFrame df to a Parquet file, partitioned by the column country, and overwrite any existing data at the destination path.
Which code should the data engineer use to accomplish this task in Apache Spark?
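A minimal runnable sketch of the standard DataFrameWriter chain; the sample rows and the output path are illustrative assumptions:
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
# illustrative stand-in for the df from the question
df = spark.createDataFrame([("US", 1), ("DE", 2)], ["country", "value"])

# overwrite any existing data at the path; one subdirectory per country value
df.write.mode("overwrite").partitionBy("country").parquet("/tmp/output")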