Big Halloween Sale Limited Time 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: simple70

Databricks Certified Associate Developer for Apache Spark 3.5 – Python Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Question # 9 Topic 1 Discussion

Databricks Certified Associate Developer for Apache Spark 3.5 – Python Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Question # 9 Topic 1 Discussion

Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Exam Topic 1 Question 9 Discussion:
Question #: 9
Topic #: 1

An MLOps engineer is building a Pandas UDF that applies a language model that translates English strings into Spanish. The initial code is loading the model on every call to the UDF, which is hurting the performance of the data pipeline.

The initial code is:

Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Question 9

def in_spanish_inner(df: pd.Series) -> pd.Series:

model = get_translation_model(target_lang='es')

return df.apply(model)

in_spanish = sf.pandas_udf(in_spanish_inner, StringType())

How can the MLOps engineer change this code to reduce how many times the language model is loaded?


A.

Convert the Pandas UDF to a PySpark UDF


B.

Convert the Pandas UDF from a Series → Series UDF to a Series → Scalar UDF


C.

Run the in_spanish_inner() function in a mapInPandas() function call


D.

Convert the Pandas UDF from a Series → Series UDF to an Iterator[Series] → Iterator[Series] UDF


Get Premium Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Questions

Contribute your Thoughts:


Chosen Answer:
This is a voting comment (?). It is better to Upvote an existing comment if you don't have anything to add.