Databricks Certified Professional Data Scientist Exam Databricks-Certified-Professional-Data-Scientist Question # 12 Topic 2 Discussion

Question #: 12
Topic #: 2

You are building a classifier from a very high-dimensional data set, similar to the one shown in the image, with 5000 variables (many columns, relatively few rows). It can handle both dense and sparse input. Which technique is most suitable, and why?

[Image: Databricks-Certified-Professional-Data-Scientist Question 12]


A. Logistic regression with L1 regularization, to prevent overfitting

B. Naive Bayes, because Bayesian methods act as regularizers

C. k-nearest neighbors, because it uses local neighborhoods to classify examples

D. Random forest, because it is an ensemble method
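The page itself does not state the correct answer, so the following is only a sketch of the reasoning usually given for option A: with far more columns than rows, an L1 penalty drives most coefficients to exactly zero, which limits overfitting. The example uses only the Python standard library and a hand-written proximal gradient loop. The data set is synthetic and scaled down (40 rows, 300 columns, 5 informative columns), and the names `fit_l1_logistic` and `soft_threshold` and the values of `lam`, `lr`, and `n_iter` are illustrative assumptions, not part of the question.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, lam=0.1, lr=0.1, n_iter=100):
    """L1-penalized logistic regression via proximal gradient descent (no intercept)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        # Gradient of the mean log-loss with respect to each weight.
        grad = [0.0] * p
        for row, label in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, row))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            residual = 1.0 / (1.0 + math.exp(-z)) - label
            for j in range(p):
                grad[j] += residual * row[j] / n
        # Gradient step followed by the L1 proximal step (soft-thresholding).
        w = [soft_threshold(w[j] - lr * grad[j], lr * lam) for j in range(p)]
    return w


# Synthetic "wide" data: many more columns than rows (assumed sizes).
rng = random.Random(0)
n_rows, n_cols, n_informative = 40, 300, 5
X = [[rng.gauss(0.0, 1.0) for _ in range(n_cols)] for _ in range(n_rows)]
# Only the first few columns determine the label; the rest are noise.
y = [1 if sum(row[:n_informative]) > 0 else 0 for row in X]

weights = fit_l1_logistic(X, y)
n_nonzero = sum(1 for wj in weights if wj != 0.0)
print(f"{n_nonzero} of {n_cols} coefficients are nonzero")
```

In practice a library implementation would be used instead of a hand-written loop; the point here is only that the soft-thresholding step leaves many weights at exactly zero, so the fitted model depends on a small subset of the columns.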

