Pass the Databricks Certification Databricks-Certified-Professional-Data-Scientist Questions and Answers with CertsForce

Viewing page 2 out of 5 pages
Viewing questions 11-20 out of questions
Questions # 11:

What type of output is generated by linear regression?

Options:

A.

Continuous variable


B.

Discrete variable


C.

Either a continuous or a discrete variable


D.

Values between 0 and 1
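
As an illustration of what a fitted linear regression returns, here is a minimal sketch of ordinary least squares for a single predictor, using only the Python standard library. The data points and the helper names fit_line and predict are hypothetical, chosen only for this example.

```python
# Ordinary least-squares fit of y = intercept + slope * x.
# The data below is made up for illustration (it lies exactly on y = 2x).

def fit_line(xs, ys):
    """Return (intercept, slope) that minimize the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Evaluate the fitted line at x; the result is a real number."""
    return intercept + slope * x


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

intercept, slope = fit_line(xs, ys)
prediction = predict(intercept, slope, 2.75)
print(prediction)
```

The prediction is an unbounded real number rather than a class label or a value restricted to the interval from 0 to 1.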


Questions # 12:

You are building a classifier from a very high-dimensional data set, similar to the one shown in the image, with 5000 variables (many columns, relatively few rows). It can handle both dense and sparse input. Which technique is most suitable, and why?

[Image for Question # 12]

Options:

A.

Logistic regression with L1 regularization, to prevent overfitting


B.

Naive Bayes, because Bayesian methods act as regularizers


C.

k-nearest neighbors, because it uses local neighborhoods to classify examples


D.

Random forest, because it is an ensemble method


Questions # 13:

In statistics, maximum-likelihood estimation (MLE) is a method of estimating the parameters of a statistical model. When applied to a data set and given a statistical model, maximum-likelihood estimation provides estimates for the model's parameters. The normalizing constant is usually ignored in MLE because:

Options:

A.

The normalizing constant is always very close to 1


B.

The normalizing constant only has a small impact on the maximum likelihood


C.

The normalizing constant is often zero and can cause division by zero


D.

The normalizing constant doesn't impact the maximizing value
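
The role of the normalizing constant can be checked numerically. The sketch below assumes a normal model with known standard deviation, so that the constant does not depend on the parameter being estimated. The sample values, the grid, and the function name log_likelihood are hypothetical.

```python
import math

# Hypothetical sample, assumed to come from a normal distribution with sigma = 1.
data = [4.8, 5.1, 5.3, 4.9, 5.4]
SIGMA = 1.0


def log_likelihood(mu, include_constant):
    """Gaussian log-likelihood of `data` for a candidate mean `mu`."""
    total = 0.0
    for x in data:
        # Term that depends on mu.
        total += -((x - mu) ** 2) / (2 * SIGMA ** 2)
        if include_constant:
            # Log of the normalizing constant 1 / (sigma * sqrt(2 * pi)).
            # It does not involve mu.
            total += -math.log(SIGMA * math.sqrt(2 * math.pi))
    return total


# Candidate values of mu on a grid from 4.00 to 6.00 in steps of 0.01.
grid = [4.0 + 0.01 * i for i in range(201)]

best_with = max(grid, key=lambda mu: log_likelihood(mu, True))
best_without = max(grid, key=lambda mu: log_likelihood(mu, False))

print(best_with, best_without)
```

Including the constant shifts every log-likelihood value by the same amount, so both searches select the same grid point, which here is the one closest to the sample mean.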


Questions # 14:

You are creating a classification process whose inputs are the income, education, and current debt of a customer. What could be a possible output of this process?

Options:

A.

Probability that the customer defaults on loan repayment


B.

Percentage of the customer's loan repayment capability


C.

Percentage indicating whether the customer should be given a loan or not


D.

The output might be a risk class, such as "good", "acceptable", "average", or "unacceptable".


Questions # 15:

Which of the following techniques can be used in the design of recommender systems?

Options:

A.

Naive Bayes classifier


B.

Power iteration


C.

Collaborative filtering


D.

A and C


E.

B and C


Questions # 16:

Select the correct statement that applies to logistic regression.

Options:

A.

Computationally inexpensive, easy to implement, and its knowledge representation is easy to interpret


B.

May have low accuracy


C.

Works with numeric values


D.

Only A and C are correct


E.

A, B, and C are all correct


Questions # 17:

You are working on a problem where you have to predict whether a claim is valid or not. You find that most invalid claims contain spelling errors as well as corrections in the manually filled claim forms, compared with honest claims. Which of the following techniques is suitable for determining whether a claim is valid?

Options:

A.

Naive Bayes


B.

Logistic Regression


C.

Random Decision Forests


D.

Any one of the above


Questions # 18:

Reducing data from many features to a small number, so that it can be properly visualized in two or three dimensions, is done in _______

Options:

A.

supervised learning


B.

unsupervised learning


C.

k-Nearest Neighbors


D.

Support vector machines


Questions # 19:

Consider the following confusion matrix for a data set with 600 out of 11,100 instances positive:

In this case, Precision = 50%, Recall = 83%, Specificity = 95%, and Accuracy = 95%.

Select the correct statement

[Image for Question # 19: confusion matrix]

Options:

A.

Precision is low, which means the classifier is predicting positives well


B.

Precision is low, which means the classifier is predicting positives poorly


C.

The problem domain has a major impact on the measures that should be used to evaluate a classifier within it


D.

A and C


E.

B and C
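
The four measures quoted in this question can be recomputed from confusion-matrix counts. The counts in the sketch below are an assumption: they are one set of values consistent with 600 positives out of 11,100 instances and with the stated percentages, not figures taken from the original image.

```python
# Assumed counts (not taken from the original image): one confusion matrix
# consistent with 600 positives out of 11,100 instances.
tp, fn = 500, 100        # actual positives: 600
fp, tn = 500, 10_000     # actual negatives: 10,500

total = tp + fn + fp + tn

precision = tp / (tp + fp)      # share of predicted positives that are correct
recall = tp / (tp + fn)         # share of actual positives that are found
specificity = tn / (tn + fp)    # share of actual negatives that are found
accuracy = (tp + tn) / total    # share of all instances classified correctly

for name, value in [
    ("Precision", precision),
    ("Recall", recall),
    ("Specificity", specificity),
    ("Accuracy", accuracy),
]:
    print(f"{name}: {value:.1%}")
```

With so few positives in the data, accuracy and specificity are dominated by the large negative class, which is why they can be high while precision stays at one half.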


Questions # 20:

Select the problems that can be solved using SVMs.

Options:

A.

SVMs are helpful in text and hypertext categorization


B.

Classification of images can also be performed using SVMs


C.

SVMs are also useful in medical science to classify proteins with up to 90% of the compounds classified correctly


D.

Hand-written characters can be recognized using SVM

