Google Professional Machine Learning Engineer Professional-Machine-Learning-Engineer Question # 24 Topic 3 Discussion

Professional-Machine-Learning-Engineer Exam Topic 3 Question 24 Discussion:
Question #: 24
Topic #: 3

You are experimenting with a built-in distributed XGBoost model in Vertex AI Workbench user-managed notebooks. You use BigQuery to split your data into training and validation sets using the following queries:

CREATE OR REPLACE TABLE `myproject.mydataset.training` AS
(SELECT * FROM `myproject.mydataset.mytable` WHERE RAND() <= 0.8);

CREATE OR REPLACE TABLE `myproject.mydataset.validation` AS
(SELECT * FROM `myproject.mydataset.mytable` WHERE RAND() <= 0.2);

After training the model, you achieve an area under the receiver operating characteristic curve (AUC ROC) value of 0.8, but after deploying the model to production, you notice that your model performance has dropped to an AUC ROC value of 0.65. What problem is most likely occurring?


A. There is training-serving skew in your production environment.

B. There is not a sufficient amount of training data.

C. The tables that you created to hold your training and validation records share some records, and you may not be using all the data in your initial table.

D. The RAND() function generated a number that is less than 0.2 in both instances, so every record in the validation table will also be in the training table.
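
A rough way to check the overlap described in option C: each query calls RAND() independently for every row, so a row's membership in one table says nothing about its membership in the other. The Python sketch below is only a simulation under that assumption (uniform draws standing in for RAND(); the helper name simulate_split is made up for illustration and nothing here touches BigQuery or Vertex AI). It estimates the fraction of source rows that land in both tables and the fraction that land in neither. Under independence both should come out near 0.8 × 0.2 = 0.16.

```python
import random


def simulate_split(n_rows, train_p=0.8, valid_p=0.2, seed=0):
    """Simulate two independent RAND()-style filters over the same rows.

    Returns (fraction of rows in both tables, fraction of rows in neither).
    """
    rng = random.Random(seed)
    both = 0
    neither = 0
    for _ in range(n_rows):
        # Each query draws its own random number for the row.
        in_train = rng.random() <= train_p
        in_valid = rng.random() <= valid_p
        if in_train and in_valid:
            both += 1
        elif not in_train and not in_valid:
            neither += 1
    return both / n_rows, neither / n_rows


both_frac, neither_frac = simulate_split(200_000)
print(f"in both tables: {both_frac:.3f}")
print(f"in neither table: {neither_frac:.3f}")
```

If the independence assumption holds, roughly 80% of the rows in the validation table were also in the training table, which would inflate the validation AUC ROC relative to what the model sees in production.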

