A data scientist is performing a linear regression and wants to construct a model that explains the most variation in the data. Which of the following should the data scientist maximize when evaluating the regression performance metrics?
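→ R² (coefficient of determination) measures the proportion of the variance in the dependent variable that the model explains, computed as 1 minus the ratio of the residual sum of squares to the total sum of squares. A higher R² means the model accounts for a larger share of that variance, so it is the metric to maximize when the goal is explanatory power in a linear regression.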
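To make the definition concrete, here is a minimal sketch in plain Python that fits a one-variable least-squares line and computes R² from the residual and total sums of squares. The helper names `fit_line` and `r_squared` and the sample data are illustrative assumptions, not part of the exam material.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot (hypothetical helper)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))  # unexplained variation
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)             # total variation
    return 1 - ss_res / ss_tot


# Made-up sample data that is close to linear.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
preds = [slope * x + intercept for x in xs]
r2 = r_squared(ys, preds)
print(f"R^2 = {r2:.4f}")  # close to 1 for this nearly linear data
```

A model that always predicts the mean of the observed values gets R² = 0, and a model that reproduces every observation exactly gets R² = 1, which is why a larger value indicates more explained variation.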
Why the other options are incorrect:
A: Accuracy is used in classification, not regression.
C: p-values test statistical significance of coefficients, not overall model fit.
D: AUC (Area Under the Curve) applies to classification models, not regression.
Official References:
CompTIA DataX (DY0-001) Study Guide – Section 3.2: “R² is a regression performance metric indicating the proportion of variance explained by the independent variables.”