Explanation
This question is tricky. Two things are important to know here:
First, the syntax for createDataFrame: it expects a list of tuples, like so: [(1,), (2,)]. When a tuple contains only a single item, you must put a comma after the item so that Python interprets it as a tuple rather than as an ordinary parenthesized expression.
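The single-item tuple rule can be checked in plain Python, no Spark required:

```python
# A trailing comma is what makes a single-item tuple a tuple.
with_comma = ("23/01/2022 11:28:12",)    # tuple containing one string
without_comma = ("23/01/2022 11:28:12")  # just a parenthesized string

print(type(with_comma))    # <class 'tuple'>
print(type(without_comma)) # <class 'str'>
```

This is exactly why createDataFrame([("…",), ("…",)], ["date"]) works while a plain list of strings does not: only the first form gives Spark row-shaped data.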
Second, you should understand the to_timestamp syntax. You can find out more about it in the documentation linked below.
For good measure, let's examine in detail why the incorrect options are wrong:
dfDates = spark.createDataFrame([("23/01/2022 11:28:12",),("24/01/2022 10:58:34",)], ["date"])
This code snippet does everything the question asks for – except that the data type of the date column is a string, not a timestamp. When no schema is specified, Spark infers the string data type by default.
dfDates = spark.createDataFrame(["23/01/2022 11:28:12","24/01/2022 10:58:34"], ["date"])
dfDates = dfDates.withColumn("date", to_timestamp("dd/MM/yyyy HH:mm:ss", "date"))
In the first line of this command, Spark throws the following error: TypeError: Can not infer schema for type: &lt;class 'str'&gt;. This is because Spark expects to find row information, but instead finds plain strings. This is why you need to specify the data as tuples. Fortunately, the Spark documentation (linked below) shows a number of examples for creating DataFrames that should help you get on the right track here.
dfDates = spark.createDataFrame([("23/01/2022 11:28:12",),("24/01/2022 10:58:34",)], ["date"])
dfDates = dfDates.withColumnRenamed("date", to_timestamp("date", "yyyy-MM-dd HH:mm:ss"))
The issue with this answer is that the operator withColumnRenamed is used. This operator simply renames a column; it cannot modify the column's contents. This is why withColumn should be used instead. In addition, the date format yyyy-MM-dd HH:mm:ss does not match the format of the actual timestamp strings, such as "23/01/2022 11:28:12".
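To see why the pattern mismatch matters, here is an analogy in plain Python. Note this is an illustration only: Python's datetime.strptime uses % codes rather than Spark's letter patterns ("%d/%m/%Y %H:%M:%S" plays the role of "dd/MM/yyyy HH:mm:ss"), but the effect of a wrong pattern is the same – the value cannot be parsed.

```python
from datetime import datetime

ts = "23/01/2022 11:28:12"

# Matching pattern: parsing succeeds.
parsed = datetime.strptime(ts, "%d/%m/%Y %H:%M:%S")
print(parsed)  # 2022-01-23 11:28:12

# yyyy-MM-dd-style pattern: does not match the string, parsing fails.
try:
    datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
except ValueError as e:
    print("mismatch:", e)
```

In Spark, to_timestamp with a non-matching pattern yields null values (or an error, depending on the parser mode) rather than the timestamps you want.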
dfDates = spark.createDataFrame(["23/01/2022 11:28:12","24/01/2022 10:58:34"], ["date"])
dfDates = dfDates.withColumnRenamed("date", to_datetime("date", "yyyy-MM-dd HH:mm:ss"))
Here, withColumnRenamed is used instead of withColumn (see above). In addition, the rows are not expressed correctly – they should be written as tuples, using parentheses. Furthermore, to_datetime is not a PySpark function (it belongs to pandas); the correct function is to_timestamp. Finally, even the date format is off here (see above).
More info: pyspark.sql.functions.to_timestamp — PySpark 3.1.2 documentation and pyspark.sql.SparkSession.createDataFrame — PySpark 3.1.1 documentation
Static notebook | Dynamic notebook: See test 2, question 38 (Databricks import instructions)