Explanation
Correct code block:
transactionsDf.withColumn("transactionDateForm", from_unixtime("transactionDate", "MMM d (EEEE)"))
The QUESTION NO: specifically asks about "adding" a column. In the context of all presented answers, DataFrame.withColumn() is the correct command for this. In theory, DataFrame.select() could
also be
used for this purpose, if all existing columns are selected and a new one is added. DataFrame.withColumnRenamed() is not the appropriate command, since it can only rename existing columns, but
cannot add a new column or change the value of a column.
Once DataFrame.withColumn() is chosen, you can read in the documentation (see below) that the first input argument to the method should be the column name of the new column.
The final difficulty is the date format. The QUESTION NO: indicates that the date format Apr 26 (Sunday) is desired. The answers give "MMM d (EEEE)" and "MM d (EEE)" as options. It can be hard
to
know the details of the date format that is used in Spark. Specifically, knowing the differences between MMM and MM is probably not something you deal with every day. But, there is an easy way
to remember the difference: M (one letter) is usually the shortest form: 4 for April. MM includes padding: 04 for April. MMM (three letters) is the three-letter month abbreviation: Apr for April. And
MMMM is the longest possible form: April. Knowing this four-letter sequence helps you select the correct option here.
More info: pyspark.sql.DataFrame.withColumn — PySpark 3.1.2 documentation
Static notebook | Dynamic notebook: See test 3, QUESTION NO: 35 (Databricks import instructions)
Submit