Pass the Google Professional Machine Learning Engineer (Professional-Machine-Learning-Engineer) exam with CertsForce questions and answers

Question # 71:

You are developing a model to predict whether a failure will occur in a critical machine part. You have a dataset consisting of a multivariate time series and labels indicating whether the machine part failed. You recently started experimenting with a few different preprocessing and modeling approaches in a Vertex AI Workbench notebook. You want to log data and track artifacts from each run. How should you set up your experiments?

Options:

A.

B.

C.

D.



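The scenario centers on logging parameters, metrics, and artifacts from a Vertex AI Workbench notebook, which is what Vertex AI Experiments is built for. Below is a minimal sketch using the Vertex AI SDK; the project ID, experiment name, run name, parameters, and metric values are all hypothetical.

```python
# Minimal experiment-tracking sketch with the Vertex AI SDK.
# Project, experiment, and run names are hypothetical placeholders.
from google.cloud import aiplatform

aiplatform.init(
    project="my-project",            # hypothetical project ID
    location="us-central1",
    experiment="part-failure-exp",   # one experiment for the whole study
)

# One run per preprocessing/modeling variant tried in the notebook.
aiplatform.start_run("lstm-window-48")
aiplatform.log_params({"model": "lstm", "window_size": 48, "lr": 1e-3})

# ... train and evaluate the model here ...

aiplatform.log_metrics({"val_auc": 0.91})  # illustrative value
aiplatform.end_run()
```

Each run's parameters, metrics, and artifact lineage then appear side by side in the Vertex AI Experiments UI for comparison.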
Question # 72:

You are designing an ML recommendation model for shoppers on your company's e-commerce website. You will use Recommendations AI to build, test, and deploy your system. How should you develop recommendations that increase revenue while following best practices?

Options:

A.

Use the "Other Products You May Like" recommendation type to increase the click-through rate


B.

Use the "Frequently Bought Together' recommendation type to increase the shopping cart size for each order.


C.

Import your user events and then your product catalog to make sure you have the highest quality event stream.


D.

Because it will take time to collect and record product data, use placeholder values for the product catalog to test the viability of the model.


Question # 73:

Your company manages a video sharing website where users can watch and upload videos. You need to create an ML model to predict which newly uploaded videos will be the most popular so that those videos can be prioritized on your company's website. Which result should you use to determine whether the model is successful?

Options:

A.

The model predicts videos as popular if the user who uploads them has over 10,000 likes.


B.

The model predicts 97.5% of the most popular clickbait videos measured by number of clicks.


C.

The model predicts 95% of the most popular videos measured by watch time within 30 days of being uploaded.


D.

The Pearson correlation coefficient between the log-transformed number of views after 7 days and 30 days after publication is equal to 0.


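Option C describes a coverage-style success metric: of the videos that truly become the most popular (measured by 30-day watch time), what fraction does the model rank at the top? A toy sketch of that computation follows; all scores and watch times are made up.

```python
# Fraction of the true top-n videos (by 30-day watch time) that the model
# also places in its predicted top-n. All numbers are illustrative.

def top_n_coverage(predicted_scores, true_watch_time, n):
    top_pred = set(sorted(predicted_scores, key=predicted_scores.get, reverse=True)[:n])
    top_true = set(sorted(true_watch_time, key=true_watch_time.get, reverse=True)[:n])
    return len(top_pred & top_true) / n

scores = {"v1": 0.9, "v2": 0.4, "v3": 0.8, "v4": 0.1}  # model popularity scores
watch = {"v1": 500, "v2": 120, "v3": 450, "v4": 30}    # 30-day watch time, hours

print(top_n_coverage(scores, watch, n=2))  # 1.0 -> both true top-2 videos found
```

A threshold such as "predicts 95% of the most popular videos" is then a floor on this coverage.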
Question # 74:

You work for a biotech startup that is experimenting with deep learning ML models based on properties of biological organisms. Your team frequently works on early-stage experiments with new ML model architectures, and writes custom TensorFlow ops in C++. You train your models on large datasets with large batch sizes. A typical batch has 1024 examples, each about 1 MB in size. The average size of a network with all weights and embeddings is 20 GB. What hardware should you choose for your models?

Options:

A.

A cluster with 2 n1-highcpu-64 machines, each with 8 NVIDIA Tesla V100 GPUs (128 GB GPU memory in total), and a n1-highcpu-64 machine with 64 vCPUs and 58 GB RAM


B.

A cluster with 2 a2-megagpu-16g machines, each with 16 NVIDIA Tesla A100 GPUs (640 GB GPU memory in total), 96 vCPUs, and 1.4 TB RAM


C.

A cluster with an n1-highcpu-64 machine with a v2-8 TPU and 64 GB RAM


D.

A cluster with 4 n1-highcpu-96 machines, each with 96 vCPUs and 86 GB RAM


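The question's own numbers support some quick arithmetic, sketched below. Per-GPU memory is derived purely from the totals stated in the options (128 GB across 8 V100s, 640 GB across 16 A100s); note also that custom C++ TensorFlow ops run on CPUs and GPUs but are not supported on TPUs, which bears on option C.

```python
# Back-of-the-envelope memory arithmetic using only figures from the question.
batch_examples = 1024
example_size_mb = 1.0
model_size_gb = 20.0

batch_gb = batch_examples * example_size_mb / 1024
print(f"input per training step: ~{batch_gb:.0f} GB")  # ~1 GB per batch

per_v100_gb = 128 / 8    # option A: 128 GB total across 8 V100s -> 16 GB each
per_a100_gb = 640 / 16   # option B: 640 GB total across 16 A100s -> 40 GB each

# If the full 20 GB of weights and embeddings must sit on one device:
print(model_size_gb > per_v100_gb)   # True  -> overflows a single V100
print(model_size_gb <= per_a100_gb)  # True  -> fits on a single A100
```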
Question # 75:

You built and manage a production system that is responsible for predicting sales numbers. Model accuracy is crucial, because the production model is required to keep up with market changes. Since being deployed to production, the model hasn't changed; however, its accuracy has steadily deteriorated. What issue is most likely causing the steady decline in model accuracy?

Options:

A.

Poor data quality


B.

Lack of model retraining


C.

Too few layers in the model to capture information


D.

Incorrect data split ratio during model training, evaluation, validation, and test


Question # 76:

You are an ML engineer responsible for designing and implementing training pipelines for ML models. You need to create an end-to-end training pipeline for a TensorFlow model. The TensorFlow model will be trained on several terabytes of structured data. You need the pipeline to include data quality checks before training and model quality checks after training but prior to deployment. You want to minimize development time and the need for infrastructure maintenance. How should you build and orchestrate your training pipeline?

Options:

A.

Create the pipeline using Kubeflow Pipelines domain-specific language (DSL) and predefined Google Cloud components. Orchestrate the pipeline using Vertex AI Pipelines.


B.

Create the pipeline using TensorFlow Extended (TFX) and standard TFX components. Orchestrate the pipeline using Vertex AI Pipelines.


C.

Create the pipeline using Kubeflow Pipelines domain-specific language (DSL) and predefined Google Cloud components. Orchestrate the pipeline using Kubeflow Pipelines deployed on Google Kubernetes Engine.


D.

Create the pipeline using TensorFlow Extended (TFX) and standard TFX components. Orchestrate the pipeline using Kubeflow Pipelines deployed on Google Kubernetes Engine.


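To make the TFX route concrete, here is a minimal sketch (assuming TFX 1.x) of a pipeline whose standard components provide data quality checks before training (ExampleValidator) and model quality checks after training (Evaluator), compiled into a job spec that Vertex AI Pipelines can run. The bucket paths and trainer module file are hypothetical.

```python
# Minimal TFX 1.x pipeline sketch: data checks before training, model checks
# after. Paths and the trainer module are hypothetical placeholders.
from tfx import v1 as tfx

def build_pipeline(name: str, root: str, data_root: str) -> tfx.dsl.Pipeline:
    example_gen = tfx.components.CsvExampleGen(input_base=data_root)
    statistics_gen = tfx.components.StatisticsGen(
        examples=example_gen.outputs["examples"])
    schema_gen = tfx.components.SchemaGen(
        statistics=statistics_gen.outputs["statistics"])

    # Data quality gate: flag anomalies against the schema before training.
    example_validator = tfx.components.ExampleValidator(
        statistics=statistics_gen.outputs["statistics"],
        schema=schema_gen.outputs["schema"])

    trainer = tfx.components.Trainer(
        module_file="trainer.py",  # hypothetical training module
        examples=example_gen.outputs["examples"],
        schema=schema_gen.outputs["schema"],
        train_args=tfx.proto.TrainArgs(num_steps=1000),
        eval_args=tfx.proto.EvalArgs(num_steps=100))

    # Model quality gate: evaluate before any deployment step would run.
    evaluator = tfx.components.Evaluator(
        examples=example_gen.outputs["examples"],
        model=trainer.outputs["model"])

    return tfx.dsl.Pipeline(
        pipeline_name=name,
        pipeline_root=root,
        components=[example_gen, statistics_gen, schema_gen,
                    example_validator, trainer, evaluator])

# Compile to pipeline.json, which Vertex AI Pipelines executes serverlessly.
tfx.orchestration.experimental.KubeflowV2DagRunner(
    config=tfx.orchestration.experimental.KubeflowV2DagRunnerConfig(),
    output_filename="pipeline.json",
).run(build_pipeline("tfx-train", "gs://my-bucket/root", "gs://my-bucket/data"))
```

Running the compiled spec on Vertex AI Pipelines avoids maintaining a GKE cluster, which is the infrastructure-maintenance concern in the question.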
Question # 77:

You trained a model on data stored in a Cloud Storage bucket. The model needs to be retrained frequently in Vertex AI Training using the latest data in the bucket. Data preprocessing is required prior to retraining. You want to build a simple and efficient near-real-time ML pipeline in Vertex AI that will preprocess the data when new data arrives in the bucket. What should you do?

Options:

A.

Create a pipeline using the Vertex AI SDK. Schedule the pipeline with Cloud Scheduler to preprocess the new data in the bucket. Store the processed features in Vertex AI Feature Store.


B.

Create a Cloud Run function that is triggered when new data arrives in the bucket. The function initiates a Vertex AI Pipeline to preprocess the new data and store the processed features in Vertex AI Feature Store.


C.

Build a Dataflow pipeline to preprocess the new data in the bucket and store the processed features in BigQuery. Configure a cron job to trigger the pipeline execution.


D.

Use the Vertex AI SDK to preprocess the new data in the bucket prior to each model retraining. Store the processed features in BigQuery.


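As a sketch of the event-driven approach, the function below (assuming the Functions Framework with a Cloud Storage CloudEvent trigger) launches a precompiled Vertex AI Pipeline whenever a new object lands in the bucket. The project, compiled template path, and the input_uri pipeline parameter are hypothetical.

```python
# Event-driven trigger sketch: new object in the bucket -> Vertex AI Pipeline.
# Project, template path, and pipeline parameter names are hypothetical.
import functions_framework
from google.cloud import aiplatform

@functions_framework.cloud_event
def on_new_data(cloud_event):
    data = cloud_event.data
    gcs_uri = f"gs://{data['bucket']}/{data['name']}"  # the newly arrived file

    aiplatform.init(project="my-project", location="us-central1")
    job = aiplatform.PipelineJob(
        display_name="preprocess-on-arrival",
        template_path="gs://my-bucket/pipelines/preprocess.json",
        parameter_values={"input_uri": gcs_uri},
    )
    job.submit()  # non-blocking, so the function returns quickly
```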
Question # 78:

You are developing a training pipeline for a new XGBoost classification model based on tabular data. The data is stored in a BigQuery table. You need to complete the following steps:

1. Randomly split the data into training and evaluation datasets in a 65/35 ratio.

2. Conduct feature engineering.

3. Obtain metrics for the evaluation dataset.

4. Compare models trained in different pipeline executions.

How should you execute these steps?

Options:

A.

1. Using Vertex AI Pipelines, add a component to divide the data into training and evaluation sets, and add another component for feature engineering.

2. Enable autologging of metrics in the training component.

3. Compare pipeline runs in Vertex AI Experiments.


B.

1. Using Vertex AI Pipelines, add a component to divide the data into training and evaluation sets, and add another component for feature engineering.

2. Enable autologging of metrics in the training component.

3. Compare models using the artifacts lineage in Vertex ML Metadata.


C.

1. In BigQuery ML, use the CREATE MODEL statement with BOOSTED_TREE_CLASSIFIER as the model type, and use BigQuery to handle the data splits.

2. Use a SQL view to apply feature engineering, and train the model using the data in that view.

3. Compare the evaluation metrics of the models by using a SQL query with the ML.TRAINING_INFO statement.


D.

1. In BigQuery ML, use the CREATE MODEL statement with BOOSTED_TREE_CLASSIFIER as the model type, and use BigQuery to handle the data splits.

2. Use the TRANSFORM clause to specify the feature engineering transformations, and train the model using the data in the table.

3. Compare the evaluation metrics of the models by using a SQL query with the ML.TRAINING_INFO statement.


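For the BigQuery ML route described in the last two options, the sketch below (run through the Python client) creates a boosted-tree classifier with a random 65/35 split and a TRANSFORM clause for feature engineering, then reads ML.TRAINING_INFO. The dataset, table, and column names are hypothetical.

```python
# BigQuery ML sketch: CREATE MODEL with TRANSFORM, a 65/35 random split,
# then per-iteration metrics via ML.TRAINING_INFO. Names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()

client.query("""
CREATE OR REPLACE MODEL `my_dataset.xgb_model`
TRANSFORM(
  ML.STANDARD_SCALER(pressure) OVER () AS pressure_scaled,  -- feature engineering
  label
)
OPTIONS(
  model_type = 'BOOSTED_TREE_CLASSIFIER',
  input_label_cols = ['label'],
  data_split_method = 'RANDOM',
  data_split_eval_fraction = 0.35        -- 65/35 train/eval split
) AS
SELECT pressure, label FROM `my_dataset.training_table`
""").result()

rows = client.query(
    "SELECT * FROM ML.TRAINING_INFO(MODEL `my_dataset.xgb_model`)").result()
for row in rows:
    print(dict(row))
```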
Question # 79:

You are an ML engineer at a manufacturing company. You are creating a classification model for a predictive maintenance use case. You need to predict whether a crucial machine will fail in the next three days so that the repair crew has enough time to fix the machine before it breaks. Regular maintenance of the machine is relatively inexpensive, but a failure would be very costly. You have trained several binary classifiers to predict whether the machine will fail, where a prediction of 1 means that the ML model predicts a failure.

You are now evaluating each model on an evaluation dataset. You want to choose a model that prioritizes detection while ensuring that more than 50% of the maintenance jobs triggered by your model address an imminent machine failure. Which model should you choose?

Options:

A.

The model with the highest area under the receiver operating characteristic curve (AUC ROC) and precision greater than 0.5.


B.

The model with the lowest root mean squared error (RMSE) and recall greater than 0.5.


C.

The model with the highest recall where precision is greater than 0.5.


D.

The model with the highest precision where recall is greater than 0.5.


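The question's wording maps cleanly onto standard metrics: precision is the fraction of model-triggered maintenance jobs that address a real imminent failure (the more-than-50% requirement), and recall measures how many actual failures are detected. A toy selection sketch with made-up metrics:

```python
# Toy selection rule: maximize detection (recall) subject to a precision
# floor of 0.5. All candidate metrics below are made up for illustration.
candidates = {
    "model_a": {"precision": 0.62, "recall": 0.71},
    "model_b": {"precision": 0.48, "recall": 0.93},  # fails the precision floor
    "model_c": {"precision": 0.55, "recall": 0.84},
}

eligible = {name: m for name, m in candidates.items() if m["precision"] > 0.5}
best = max(eligible, key=lambda name: eligible[name]["recall"])
print(best)  # model_c: highest recall among models with precision > 0.5
```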