Pass the SAS Institute Statistical Business Analyst A00-240 Questions and answers with CertsForce

Viewing page 2 out of 3 pages
Viewing questions 11-20
Question # 11:

What is a drawback to performing data cleansing (imputation, transformations, etc.) on raw data prior to partitioning the data for honest assessment as opposed to performing the data cleansing after partitioning the data?

Options:

A. It violates assumptions of the model.

B. It requires extra computational effort and time.

C. It omits the training (and test) data sets from the benefits of the cleansing methods.

D. There is no ability to compare the effectiveness of different cleansing methods.

Question # 12:

A company has branch offices in eight regions. Customers within each region are classified as either "High Value" or "Medium Value" and are coded using the variable name VALUE. The total amount of purchases per customer in the last year is used as the response variable.

Suppose there is a significant interaction between REGION and VALUE. What can you conclude?

Options:

A. More high value customers are found in some regions than others.

B. The difference between average purchases for medium and high value customers depends on the region.

C. Regions with higher average purchases have more high value customers.

D. Regions with higher average purchases have more medium value customers.
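As a rough illustration of what an interaction between two categorical factors looks like in the cell means, the Python sketch below computes the gap between the "High" and "Medium" average purchases separately for each region. The two region names, the purchase amounts, and the helper `value_gap_by_region` are all hypothetical. The sketch only displays the pattern and does not test whether the interaction is statistically significant.

```python
from statistics import mean

# Hypothetical purchase totals per customer, keyed by (REGION, VALUE).
purchases = {
    ("North", "High"): [900, 1100],
    ("North", "Medium"): [500, 700],
    ("South", "High"): [650, 750],
    ("South", "Medium"): [600, 700],
}

def value_gap_by_region(cells):
    """Return mean(High) - mean(Medium) for every region present in `cells`."""
    regions = sorted({region for region, _ in cells})
    return {
        region: mean(cells[(region, "High")]) - mean(cells[(region, "Medium")])
        for region in regions
    }

gaps = value_gap_by_region(purchases)
```

With no interaction, the gap would be roughly the same in every region. In this toy data the gap differs sharply between the two regions, which is the kind of pattern an interaction term captures.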


Question # 13:

Given the following output from the LOGISTIC procedure:

(Image for Question # 13: output from the LOGISTIC procedure; not reproduced.)

Which variables, among those that are statistically significant at an alpha of 0.05, have the greatest and least relative importance on the fitted model?

Options:

A. Greatest: MBA; Least: DOWN_AMT

B. Greatest: MBA; Least: CASH

C. Greatest: DOWN_AMT; Least: CASH

D. Greatest: DOWN_AMT; Least: HOME


Question # 14:

Refer to the lift chart:

(Image for Question # 14: lift chart; not reproduced.)

At a depth of 0.1, Lift = 3.14. What does this mean?

Options:

A. Selecting the top 10% of the population scored by the model should result in 3.14 times more events than a random draw of 10%.

B. Selecting the observations with a response probability of at least 10% should result in 3.14 times more events than a random draw of 10%.

C. Selecting the top 10% of the population scored by the model should result in 3.14 times greater accuracy than a random draw of 10%.

D. Selecting the observations with a response probability of at least 10% should result in 3.14 times greater accuracy than a random draw of 10%.
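For reference, lift at a given depth is usually computed as the event rate among the top-scored fraction of observations divided by the overall event rate. The following Python sketch shows that calculation on hypothetical data. The function name `lift_at_depth`, the scores, and the event flags are assumptions made for illustration and have no connection to the chart in the question.

```python
def lift_at_depth(scores, events, depth):
    """Event rate in the top `depth` fraction (ranked by score) / overall event rate."""
    # Rank observations from highest to lowest model score.
    ranked = sorted(zip(scores, events), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * depth))
    top_rate = sum(event for _, event in ranked[:n_top]) / n_top
    overall_rate = sum(events) / len(events)
    return top_rate / overall_rate

# Hypothetical scored population: 20 observations, 4 events (20% base rate).
scores = [i / 20 for i in range(20, 0, -1)]  # 1.00, 0.95, ..., 0.05
events = [1, 1, 0, 1, 0, 0, 0, 0, 1, 0,
          0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

lift_10 = lift_at_depth(scores, events, 0.10)  # top 2 observations
lift_50 = lift_at_depth(scores, events, 0.50)  # top 10 observations
```

Note that depth refers to a fraction of the ranked population, not to a cutoff on the predicted probability.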


Question # 15:

The total modeling data has been split into training, validation, and test data.

What is the best data to use for model assessment?

Options:

A. Training data

B. Total data

C. Test data

D. Validation data


Question # 16:

This question will ask you to provide a missing option. Given the following SAS program:

(Image for Question # 16: SAS program; not reproduced.)

What option must be added to the program to obtain a data set containing Pearson statistics?

Options:

A. OUTPUT=estimates

B. OUTP=estimates

C. OUTSTAT=estimates

D. OUTCORR=estimates


Question # 17:

Which of the following describes a concordant pair of observations in the LOGISTIC procedure?

Options:

A. An observation with the event has an equal probability as another observation with the event.

B. An observation with the event has a lower predicted probability than the observation without the event.

C. An observation with the event has an equal predicted probability as the observation without the event.

D. An observation with the event has a higher predicted probability than the observation without the event.
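The pair classification behind these options can be sketched in a few lines of Python: every observation with the event is compared against every observation without it, and each pair is counted as concordant, discordant, or tied according to the predicted probabilities. The function name `pair_counts` and the data are hypothetical. The sketch compares raw probabilities exactly, which may differ from how the LOGISTIC procedure handles near-ties.

```python
def pair_counts(probs, events):
    """Count (concordant, discordant, tied) event/non-event pairs."""
    event_probs = [p for p, y in zip(probs, events) if y == 1]
    nonevent_probs = [p for p, y in zip(probs, events) if y == 0]
    concordant = discordant = tied = 0
    for p_event in event_probs:
        for p_nonevent in nonevent_probs:
            if p_event > p_nonevent:
                concordant += 1   # event ranked above non-event
            elif p_event < p_nonevent:
                discordant += 1   # event ranked below non-event
            else:
                tied += 1         # same predicted probability
    return concordant, discordant, tied

# Hypothetical predicted probabilities and outcomes (1 = event, 0 = non-event).
probs = [0.9, 0.6, 0.4, 0.6, 0.2]
events = [1, 1, 1, 0, 0]

concordant, discordant, tied = pair_counts(probs, events)
# The c statistic gives concordant pairs full credit and tied pairs half credit.
c_statistic = (concordant + 0.5 * tied) / (concordant + discordant + tied)
```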


Question # 18:

A researcher is planning a logistic regression to model the probability of disease occurrence. The researcher determines the rate of disease occurrence in the population is 1%.

For which of the following would this study be a candidate?

Options:

A. overfitting

B. oversampling

C. multicollinearity

D. simple random sample


Question # 19:

Drag the adjustment formulas for oversampling from the left and place them into the correct location in the confusion matrix shown on the right.

(Image for Question # 19: adjustment formulas and confusion matrix; not reproduced.)
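The formulas from the missing image cannot be recovered here. As a hedged sketch of the adjustment commonly taught for oversampled data, each cell of the confusion matrix can be expressed as a population proportion by combining the population priors with the sensitivity and specificity measured on the oversampled data. The Python below illustrates this. The function name `adjusted_confusion_matrix` is hypothetical, the 1% prior echoes the disease rate in Question # 18, and the sensitivity and specificity values are invented.

```python
def adjusted_confusion_matrix(sensitivity, specificity, prior_event):
    """Confusion-matrix cells as population proportions under the true priors.

    sensitivity and specificity are measured on the (oversampled) data;
    prior_event is the event proportion in the population.
    """
    pi1 = prior_event        # population proportion of events
    pi0 = 1.0 - prior_event  # population proportion of non-events
    return {
        "TP": pi1 * sensitivity,          # events predicted as events
        "FN": pi1 * (1.0 - sensitivity),  # events predicted as non-events
        "TN": pi0 * specificity,          # non-events predicted as non-events
        "FP": pi0 * (1.0 - specificity),  # non-events predicted as events
    }

# Assumed values for illustration only.
cells = adjusted_confusion_matrix(sensitivity=0.8, specificity=0.7, prior_event=0.01)
total = sum(cells.values())
```

Because the four cells are proportions of the whole population, they sum to 1 regardless of the oversampling rate used to build the modeling data.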


Question # 20:

A researcher has several variables that could be possible predictors for the final model. There is interest in checking all 2-way interactions for possible entry to the model. The researcher has decided to use forward selection within PROC LOGISTIC. Fill in the missing code option that will ensure that all 2-way interactions will be considered for entry.

(Image for Question # 20: PROC LOGISTIC program with a missing option; not reproduced.)

Options:

A. start = 5

B. include = 4

C. include = 5

D. start = 4

