Explanation
Correct code block:
len(spark.read.csv(filePath, comment='#').columns)
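For context, here is a minimal sketch of how that line could be run end to end. The SparkSession setup and the file path are assumptions for illustration; on Databricks, spark already exists and filePath would point to your own file.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("column-count").getOrCreate()
filePath = "/tmp/example.csv"  # hypothetical path, for illustration only

# spark.read returns a DataFrameReader; csv() reads the file into a DataFrame,
# skipping any lines that begin with the comment character '#'.
df = spark.read.csv(filePath, comment='#')

# DataFrame.columns is a plain Python list of column names, so len() applies to it.
print(len(df.columns))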
This is a challenging question with difficulties in an unusual context: the boundary between DataFrame and the DataFrameReader. It is unlikely that a question of this difficulty level appears in the exam. However, solving it helps you get more comfortable with the DataFrameReader, a subject you will likely have to deal with in the exam.
Before dealing with the inner parentheses, it is easier to figure out the outer parentheses, gaps 1 and 5. Given the code block, the object in gap 5 would have to be evaluated by the object in gap 1, returning the number of columns in the read-in CSV. One answer option includes DataFrame in gap 1 and shape[0] in gap 5. Since DataFrame cannot be used to evaluate shape[0], we can discard this answer option.
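As a quick illustration of why the shape[0] option fails: shape is a pandas attribute, and a pyspark.sql.DataFrame does not carry it, so an answer built around it cannot return the number of columns. A small hedged sketch (the toy DataFrame is an assumption):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "letter"])

try:
    df.shape[0]  # works on a pandas DataFrame, not on a Spark DataFrame
except AttributeError as error:
    print(error)  # 'DataFrame' object has no attribute 'shape'

# The Spark-idiomatic ways to get the dimensions:
print(df.count())        # number of rows
print(len(df.columns))   # number of columns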
Other answer options include size in gap 1. size() is not a built-in Python function, so if we use it, it would have to come from somewhere else. pyspark.sql.functions includes a size() function, but it only returns the length of an array or map stored within a column (documentation linked below). So, using the size() function is not an option here. This leaves us with two potentially valid answers.
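To make the distinction concrete, here is a small hedged sketch of pyspark.sql.functions.size; the example array column is an assumption, chosen only to show that size() counts elements per row rather than measuring the DataFrame itself:

from pyspark.sql import SparkSession
from pyspark.sql.functions import size

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([([1, 2, 3],), ([4],)], ["numbers"])

# size() evaluates the length of the array stored in each row of the column.
df.select(size("numbers").alias("numbers_size")).show()
# Returns 3 for the first row and 1 for the second row.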
We have to pick between gaps 2 and 3 being spark.read or pyspark.DataFrameReader. Looking at the documentation (linked below), the DataFrameReader is a class in the pyspark.sql module, which means that we cannot access it as pyspark.DataFrameReader. Moreover, spark.read makes sense because on Databricks, spark references the current Spark session (pyspark.sql.SparkSession), and spark.read therefore returns a DataFrameReader (also see documentation below). Finally, there is only one correct answer option remaining.
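A short hedged check of that reasoning (assuming a standard PySpark environment; on Databricks the spark object is already provided):

import pyspark
from pyspark.sql import SparkSession, DataFrameReader  # DataFrameReader lives in pyspark.sql

spark = SparkSession.builder.getOrCreate()

print(type(spark))       # <class 'pyspark.sql.session.SparkSession'>
print(type(spark.read))  # <class 'pyspark.sql.readwriter.DataFrameReader'>

# The class is not exposed at the top level of the pyspark package:
print(hasattr(pyspark, "DataFrameReader"))  # False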
More info:
- pyspark.sql.functions.size — PySpark 3.1.2 documentation
- pyspark.sql.DataFrameReader.csv — PySpark 3.1.2 documentation
- pyspark.sql.SparkSession.read — PySpark 3.1.2 documentation
Static notebook | Dynamic notebook: See test 3, question 50 (Databricks import instructions)