Pass the Databricks Databricks Certification Databricks-Certified-Professional-Data-Scientist Questions and answers with CertsForce

Viewing page 1 out of 5 pages
Viewing questions 1-10 out of questions
Questions # 1:

In which of the following scenarios can we use the naive Bayes theorem for classification?

Options:

A.

Classify whether a given person is a male or a female based on the measured features. The features include height, weight and foot size.


B.

To classify whether an email is spam or not spam


C.

To identify whether a fruit is an orange or not based on features like diameter, color and shape


Questions # 2:

RMSE is a useful metric for evaluating which types of models?

Options:

A.

Logistic regression


B.

Naive Bayes classifier


C.

Linear regression


D.

All of the above


Questions # 3:

What are the key outcomes of a successful analytical project?

Options:

A.

Code of the model


B.

Technical specifications


C.

Presentations for the Analysts


D.

Presentation for Project Sponsors


Questions # 4:

Which of the following best describes principal component analysis?

Options:

A.

Dimensionality reduction


B.

Collaborative filtering


C.

Classification


D.

Regression


E.

Clustering


Questions # 5:

Suppose you have built a model for a rating system that rates items from 1 to 5 stars, and you calculate that its RMSE is 1.0. Which of the following is correct?

Options:

A.

It means that your predictions are on average one star off of what people really think


B.

It means that your predictions are on average two stars off of what people really think


C.

It means that your predictions are on average three stars off of what people really think


D.

It means that your predictions are on average four stars off of what people really think
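As a rough illustration of what an RMSE value says about star ratings, here is a minimal sketch in plain Python. The `rmse` helper and the rating lists are invented for this example and are not part of the exam material. Note that RMSE is the square root of the mean squared error, so it equals the average miss only when every miss has the same size; large misses count for more.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical ratings: every prediction misses by exactly one star.
uniform_actual = [5, 3, 4, 1, 2]
uniform_predicted = [4, 4, 3, 2, 3]
uniform_rmse = rmse(uniform_actual, uniform_predicted)

# Hypothetical ratings: three exact hits and one two-star miss.
# The mean absolute error is 0.5, yet the RMSE is the same as above.
skewed_actual = [3, 3, 3, 3]
skewed_predicted = [3, 3, 3, 5]
skewed_rmse = rmse(skewed_actual, skewed_predicted)

print(uniform_rmse, skewed_rmse)
```

Both cases give an RMSE of 1.0, which is why "about one star off" is best read as a typical error size rather than an exact average.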


Questions # 6:

Which of the following are point estimation methods?

Options:

A.

MAP


B.

MLE


C.

MMSE
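To make the three estimators concrete, the sketch below applies them to a coin-flip example. It assumes a Beta(a, b) prior on the probability of heads, for which the posterior is Beta(a + heads, b + tails). The function names, the prior, and the counts are hypothetical choices made for illustration.

```python
def mle_estimate(heads, flips):
    """Maximum likelihood estimate: the observed frequency of heads."""
    return heads / flips

def map_estimate(heads, flips, a, b):
    """Maximum a posteriori estimate: mode of the Beta posterior.

    Valid as written when the posterior parameters are both greater than 1.
    """
    return (heads + a - 1) / (flips + a + b - 2)

def mmse_estimate(heads, flips, a, b):
    """Minimum mean squared error estimate: mean of the Beta posterior."""
    return (heads + a) / (flips + a + b)

# Hypothetical data: 7 heads in 10 flips, with an assumed Beta(2, 2) prior.
heads, flips = 7, 10
a, b = 2, 2

p_mle = mle_estimate(heads, flips)
p_map = map_estimate(heads, flips, a, b)
p_mmse = mmse_estimate(heads, flips, a, b)

print(p_mle, p_map, p_mmse)
```

Each function returns a single number for the unknown parameter, which is what makes all three point estimators. The prior pulls the MAP and MMSE values toward 0.5 relative to the MLE.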


Questions # 7:

Which of the following are true with regard to the K-Means clustering algorithm?

Options:

A.

Labels are not pre-assigned to each object in the cluster.


B.

Labels are pre-assigned to each object in the cluster.


C.

It classifies the data based on the labels.


D.

It discovers the center of each cluster.


E.

It finds which cluster each object falls into.
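The behaviour the options describe can be seen in a small sketch of Lloyd's algorithm, the usual K-Means procedure, written in plain Python. The points, the starting centers, and the `kmeans` function name are assumptions made for this example. No labels are supplied: the algorithm only alternates between assigning points to the nearest center and moving each center to the mean of its points.

```python
def kmeans(points, centers, iterations=10):
    """Lloyd's algorithm on 2-D points; returns (centers, assignments)."""
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest center.
        assignments = [
            min(
                range(len(centers)),
                key=lambda k: (p[0] - centers[k][0]) ** 2
                + (p[1] - centers[k][1]) ** 2,
            )
            for p in points
        ]
        # Update step: each center moves to the mean of its assigned points.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, c in zip(points, assignments) if c == k]
            if members:
                mean_x = sum(x for x, _ in members) / len(members)
                mean_y = sum(y for _, y in members) / len(members)
                new_centers.append((mean_x, mean_y))
            else:
                # Keep an empty cluster's center where it was.
                new_centers.append(centers[k])
        if new_centers == centers:
            break  # Centers stopped moving, so the result has converged.
        centers = new_centers
    return centers, assignments

# Hypothetical unlabeled data forming two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
initial_centers = [(0, 0), (10, 10)]

final_centers, cluster_ids = kmeans(points, initial_centers)
print(final_centers, cluster_ids)
```

The output gives both the discovered center of each cluster and the cluster index of every point. The starting centers are fixed here to keep the run deterministic; practical implementations usually pick them randomly or with a seeding heuristic.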


Questions # 8:

Regularization is a very important technique in machine learning to prevent overfitting. Mathematically speaking, it adds a regularization term to the loss function so that the coefficients do not fit the training data so closely that the model overfits. The difference between L1 and L2 is...

Options:

A.

L2 is the sum of the squares of the weights, while L1 is just the sum of the weights


B.

L1 is the sum of the squares of the weights, while L2 is just the sum of the weights


C.

L1 gives non-sparse outputs while L2 gives sparse outputs


D.

None of the above
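The two penalty terms can be written out in a few lines of plain Python. This is a minimal sketch: the function names, the strength parameter `lam`, and the weight vector are hypothetical. In the standard definitions the L1 term sums the absolute values of the weights, so "the sum of the weights" in the options is best read as shorthand for that.

```python
def l1_penalty(weights, lam=1.0):
    """L1 (lasso-style) term: lam times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam=1.0):
    """L2 (ridge-style) term: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)

# Hypothetical weight vector, including a negative and a zero weight.
weights = [0.5, -2.0, 0.0, 3.0]

l1_value = l1_penalty(weights)
l2_value = l2_penalty(weights)

print(l1_value, l2_value)
```

With these weights the L1 term is 5.5 and the L2 term is 13.25. Because squaring punishes large weights far more than small ones, L2 tends to shrink all weights without zeroing them, while L1 tends to drive some weights to exactly zero and so produces sparse models.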


Questions # 9:

What is one modeling or descriptive statistical function in MADlib that is typically not provided in a standard relational database?

Options:

A.

Expected value


B.

Variance


C.

Linear regression


D.

Quantiles


Questions # 10:

Select the correct sequence of steps for developing a machine learning application

A) Analyze the input data

B) Prepare the input data

C) Collect data

D) Train the algorithm

E) Test the algorithm

F) Use it

Options:

A.

A, B, C, D, E, F


B.

C, B, A, D, E, F


C.

C, A, B, D, E, F


D.

C, B, A, D, E, F

