Pass the Google Professional Machine Learning Engineer (Professional-Machine-Learning-Engineer) exam questions and answers with CertsForce

Viewing page 2 out of 8 pages
Viewing questions 11-20
Question # 11:

You are an ML engineer at a travel company. You have been researching customers’ travel behavior for many years, and you have deployed models that predict customers’ vacation patterns. You have observed that customers’ vacation destinations vary based on seasonality and holidays; however, these seasonal variations are similar across years. You want to quickly and easily store and compare the model versions and performance statistics across years. What should you do?

Options:

A.

Store the performance statistics in Cloud SQL. Query that database to compare the performance statistics across the model versions.


B.

Create versions of your models for each season per year in Vertex AI. Compare the performance statistics across the models in the Evaluate tab of the Vertex AI UI.


C.

Store the performance statistics of each pipeline run in Kubeflow under an experiment for each season per year. Compare the results across the experiments in the Kubeflow UI.


D.

Store the performance statistics of each version of your models using seasons and years as events in Vertex ML Metadata. Compare the results across the slices.


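For reference, a minimal sketch of tracking per-season, per-year performance statistics with the Vertex AI SDK's experiment tracking (project, experiment, parameter, and metric names are assumed for illustration):

from google.cloud import aiplatform

# Assumed project, region, and experiment names.
aiplatform.init(
    project="my-project",
    location="us-central1",
    experiment="vacation-predictor",
)

# One run per model version, keyed by season and year.
aiplatform.start_run("summer-2023")
aiplatform.log_params({"season": "summer", "year": 2023})
aiplatform.log_metrics({"rmse": 0.12, "mae": 0.08})
aiplatform.end_run()

# Pull every run into a DataFrame to compare versions across years.
df = aiplatform.get_experiment_df()
print(df[["run_name", "param.season", "param.year", "metric.rmse"]])
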
Question # 12:

You lead a data science team at a large international corporation. Most of the models your team trains are large-scale models using high-level TensorFlow APIs on AI Platform with GPUs. Your team usually takes a few weeks or months to iterate on a new version of a model. You were recently asked to review your team’s spending. How should you reduce your Google Cloud compute costs without impacting the model’s performance?

Options:

A.

Use AI Platform to run distributed training jobs with checkpoints.


B.

Use AI Platform to run distributed training jobs without checkpoints.


C.

Migrate to training with Kubeflow on Google Kubernetes Engine, and use preemptible VMs with checkpoints.


D.

Migrate to training with Kubeflow on Google Kubernetes Engine, and use preemptible VMs without checkpoints.


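For reference, checkpoints are what make preemptible capacity safe: training can resume from the last saved state after a VM is reclaimed. A minimal Keras sketch, assuming a GCS bucket path for the backup directory:

import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer="adam", loss="mse")

# BackupAndRestore saves training state each epoch; if the VM is
# preempted and the job restarts, fit() resumes from the last epoch.
backup = tf.keras.callbacks.BackupAndRestore(
    backup_dir="gs://my-bucket/training-backup"  # assumed bucket path
)

x, y = np.random.rand(100, 4), np.random.rand(100, 1)
model.fit(x, y, epochs=10, callbacks=[backup])
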
Question # 13:

You are working on a classification problem with time series data and achieved an area under the receiver operating characteristic curve (AUC ROC) value of 99% for training data after just a few experiments. You haven’t explored using any sophisticated algorithms or spent any time on hyperparameter tuning. What should your next step be to identify and fix the problem?

Options:

A.

Address the model overfitting by using a less complex algorithm.


B.

Address data leakage by applying nested cross-validation during model training.


C.

Address data leakage by removing features highly correlated with the target value.


D.

Address the model overfitting by tuning the hyperparameters to reduce the AUC ROC value.


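For reference, a near-perfect training AUC achieved with little effort on time series data usually signals that future information has leaked into the training set. A minimal sketch of a chronological split that avoids this, using illustrative random data:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import TimeSeriesSplit

X = np.random.rand(500, 8)             # illustrative features, time-ordered
y = np.random.randint(0, 2, size=500)  # illustrative labels

# Each fold trains only on data that precedes its test window.
for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
    clf = RandomForestClassifier().fit(X[train_idx], y[train_idx])
    auc = roc_auc_score(y[test_idx], clf.predict_proba(X[test_idx])[:, 1])
    print(f"fold AUC: {auc:.3f}")
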
Question # 14:

You are training a custom language model for your company using a large dataset. You plan to use the Reduction Server strategy on Vertex AI. You need to configure the worker pools of the distributed training job. What should you do?

Options:

A.

Configure the machines of the first two worker pools to have GPUs and to use a container image where your training code runs. Configure the third worker pool to have GPUs, and use the reduction server container image.


B.

Configure the machines of the first two worker pools to have GPUs and to use a container image where your training code runs. Configure the third worker pool to use the reduction server container image without accelerators, and choose a machine type that prioritizes bandwidth.


C.

Configure the machines of the first two worker pools to have TPUs and to use a container image where your training code runs. Configure the third worker pool to use the reduction server container image without accelerators, and choose a machine type that prioritizes bandwidth.


D.

Configure the machines of the first two worker pools to have TPUs and to use a container image where your training code runs. Configure the third worker pool to have TPUs, and use the reduction server container image.


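For reference, a minimal sketch of a Vertex AI CustomJob whose first two worker pools run the training container on GPU machines and whose third pool runs the Reduction Server container on CPU-only, bandwidth-oriented machines (image URIs, machine types, and replica counts are assumed for illustration):

from google.cloud import aiplatform

worker_pool_specs = [
    {   # first pool: chief, runs the training container on GPUs
        "machine_spec": {
            "machine_type": "n1-standard-16",
            "accelerator_type": "NVIDIA_TESLA_V100",
            "accelerator_count": 2,
        },
        "replica_count": 1,
        "container_spec": {"image_uri": "gcr.io/my-project/trainer"},
    },
    {   # second pool: workers, same training image on GPUs
        "machine_spec": {
            "machine_type": "n1-standard-16",
            "accelerator_type": "NVIDIA_TESLA_V100",
            "accelerator_count": 2,
        },
        "replica_count": 3,
        "container_spec": {"image_uri": "gcr.io/my-project/trainer"},
    },
    {   # third pool: reducers, no accelerators, high-bandwidth machine type
        "machine_spec": {"machine_type": "n1-highcpu-16"},
        "replica_count": 4,
        "container_spec": {
            "image_uri": ("us-docker.pkg.dev/vertex-ai-restricted/"
                          "training/reductionserver:latest")
        },
    },
]

job = aiplatform.CustomJob(
    display_name="reduction-server-training",
    worker_pool_specs=worker_pool_specs,
)
job.run()
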
Question # 15:

You work for a company that sells corporate electronic products to thousands of businesses worldwide. Your company stores historical customer data in BigQuery. You need to build a model that predicts customer lifetime value over the next three years. You want to use the simplest approach to build the model. What should you do?

Options:

A.

Access BigQuery Studio in the Google Cloud console. Run the CREATE MODEL statement in the SQL editor to create an ARIMA model.


B.

Create a Vertex AI Workbench notebook. Use IPython magic to run the CREATE MODEL statement to create an ARIMA model.


C.

Access BigQuery Studio in the Google Cloud console. Run the CREATE MODEL statement in the SQL editor to create an AutoML regression model.


D.

Create a Vertex AI Workbench notebook. Use IPython magic to run the CREATE MODEL statement to create an AutoML regression model.


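For reference, a minimal sketch of running a BigQuery ML CREATE MODEL statement from Python; the same SQL can be pasted directly into the BigQuery Studio SQL editor (dataset, table, and column names are assumed for illustration):

from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # assumed project ID

sql = """
CREATE OR REPLACE MODEL `my_dataset.clv_model`
OPTIONS (
  model_type = 'AUTOML_REGRESSOR',        -- AutoML regression in BigQuery ML
  input_label_cols = ['three_year_ltv']   -- assumed label column
) AS
SELECT * FROM `my_dataset.customer_history`
"""
client.query(sql).result()  # blocks until the model finishes training
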
Question # 16:

You recently deployed a model to a Vertex AI endpoint. Your data drifts frequently, so you have enabled request-response logging and created a Vertex AI Model Monitoring job. You have observed that your model is receiving higher traffic than expected. You need to reduce the model monitoring cost while continuing to quickly detect drift. What should you do?

Options:

A.

Replace the monitoring job with a Dataflow pipeline that uses TensorFlow Data Validation (TFDV).


B.

Replace the monitoring job with a custom SQL script to calculate statistics on the features and predictions in BigQuery.


C.

Decrease the sample_rate parameter in the RandomSampleConfig of the monitoring job.


D.

Increase the monitor_interval parameter in the ScheduleConfig of the monitoring job.


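For reference, the two parameters named in options C and D as they appear in the Vertex AI SDK; a minimal sketch with assumed endpoint, feature name, and values:

from google.cloud import aiplatform
from google.cloud.aiplatform import model_monitoring

# Fraction of requests sampled for monitoring (what option C lowers).
sampling = model_monitoring.RandomSampleConfig(sample_rate=0.2)

# Hours between monitoring runs (what option D raises).
schedule = model_monitoring.ScheduleConfig(monitor_interval=6)

drift = model_monitoring.DriftDetectionConfig(
    drift_thresholds={"feature_1": 0.05}  # assumed feature name
)

job = aiplatform.ModelDeploymentMonitoringJob.create(
    display_name="drift-monitor",
    endpoint="projects/123/locations/us-central1/endpoints/456",  # assumed
    logging_sampling_strategy=sampling,
    schedule_config=schedule,
    objective_configs=model_monitoring.ObjectiveConfig(
        drift_detection_config=drift
    ),
)
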
Question # 17:

You have trained a deep neural network model on Google Cloud. The model has low loss on the training data, but is performing worse on the validation data. You want the model to be resilient to overfitting. Which strategy should you use when retraining the model?

Options:

A.

Apply a dropout parameter of 0.2, and decrease the learning rate by a factor of 10.


B.

Apply an L2 regularization parameter of 0.4, and decrease the learning rate by a factor of 10.


C.

Run a hyperparameter tuning job on AI Platform to optimize for the L2 regularization and dropout parameters.


D.

Run a hyperparameter tuning job on AI Platform to optimize for the learning rate, and increase the number of neurons by a factor of 2.


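For reference, a minimal Keras sketch showing where the L2 and dropout knobs from these options live (layer sizes and values are assumed for illustration); a hyperparameter tuning job would search over these values rather than fixing them up front:

import tensorflow as tf

l2 = tf.keras.regularizers.l2(0.01)  # L2 penalty on the layer weights

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", kernel_regularizer=l2),
    tf.keras.layers.Dropout(0.2),    # randomly zeroes 20% of activations
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss="binary_crossentropy",
)
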
Question # 18:

You work for a company that is developing an application to help users with meal planning. You want to use machine learning to scan a corpus of recipes and extract each ingredient (e.g., carrot, rice, pasta) and each kitchen cookware item (e.g., bowl, pot, spoon) mentioned. Each recipe is saved in an unstructured text file. What should you do?

Options:

A.

Create a text dataset on Vertex AI for entity extraction. Create two entities called "ingredient" and "cookware", and label at least 200 examples of each entity. Train an AutoML entity extraction model to extract occurrences of these entity types. Evaluate performance on a holdout dataset.


B.

Create a multi-label text classification dataset on Vertex AI. Create a test dataset, and label each recipe that corresponds to its ingredients and cookware. Train a multi-class classification model. Evaluate the model’s performance on a holdout dataset.


C.

Use the Entity Analysis method of the Natural Language API to extract the ingredients and cookware from each recipe. Evaluate the model’s performance on a prelabeled dataset.


D.

Create a text dataset on Vertex AI for entity extraction. Create as many entities as there are different ingredients and cookware items. Train an AutoML entity extraction model to extract those entities. Evaluate the model’s performance on a holdout dataset.


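For reference, a minimal sketch of the AutoML entity-extraction flow on Vertex AI (display names and the GCS path to the labeled JSONL file are assumed for illustration):

from google.cloud import aiplatform

dataset = aiplatform.TextDataset.create(
    display_name="recipes",
    gcs_source="gs://my-bucket/recipes-labeled.jsonl",  # assumed path
    import_schema_uri=aiplatform.schema.dataset.ioformat.text.extraction,
)

job = aiplatform.AutoMLTextTrainingJob(
    display_name="ingredient-cookware-extractor",
    prediction_type="extraction",
)
model = job.run(dataset=dataset)  # evaluate on the holdout split afterwards
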
Question # 19:

You are building a real-time prediction engine that streams files that may contain Personally Identifiable Information (PII) to Google Cloud. You want to use the Cloud Data Loss Prevention (DLP) API to scan the files. How should you ensure that the PII is not accessible by unauthorized individuals?

Options:

A.

Stream all files to Google Cloud, and then write the data to BigQuery. Periodically conduct a bulk scan of the table using the DLP API.


B.

Stream all files to Google Cloud, and write batches of the data to BigQuery. While the data is being written to BigQuery, conduct a bulk scan of the data using the DLP API.


C.

Create two buckets of data: Sensitive and Non-sensitive. Write all data to the Non-sensitive bucket. Periodically conduct a bulk scan of that bucket using the DLP API, and move the sensitive data to the Sensitive bucket.


D.

Create three buckets of data: Quarantine, Sensitive, and Non-sensitive. Write all data to the Quarantine bucket. Periodically conduct a bulk scan of that bucket using the DLP API, and move the data to either the Sensitive or Non-sensitive bucket.


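For reference, a minimal sketch of inspecting content with the Cloud DLP API and routing by the findings (project ID, info types, and the bucket-routing logic are assumed for illustration):

import google.cloud.dlp_v2 as dlp_v2

client = dlp_v2.DlpServiceClient()
parent = "projects/my-project"  # assumed project ID

inspect_config = {
    "info_types": [{"name": "EMAIL_ADDRESS"}, {"name": "PERSON_NAME"}],
    "min_likelihood": dlp_v2.Likelihood.LIKELY,
}
item = {"value": "Contact Jane Doe at jane@example.com"}

response = client.inspect_content(
    request={"parent": parent, "inspect_config": inspect_config, "item": item}
)

# Route on the result: any findings -> Sensitive bucket, none -> Non-sensitive.
for finding in response.result.findings:
    print(finding.info_type.name, finding.likelihood)
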
Question # 20:

You are developing ML models with AI Platform for image segmentation on CT scans. You frequently update your model architectures based on the newest available research papers, and have to rerun training on the same dataset to benchmark their performance. You want to minimize computation costs and manual intervention while having version control for your code. What should you do?

Options:

A.

Use Cloud Functions to identify changes to your code in Cloud Storage, and trigger a retraining job.


B.

Use the gcloud command-line tool to submit training jobs on AI Platform when you update your code.


C.

Use Cloud Build linked with Cloud Source Repositories to trigger retraining when new code is pushed to the repository.


D.

Create an automated workflow in Cloud Composer that runs daily and looks for changes in code in Cloud Storage using a sensor.


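For reference, a minimal sketch of creating a Cloud Build trigger tied to a Cloud Source Repositories repo (repo name, branch, and build-config filename are assumed; the referenced cloudbuild.yaml would submit the AI Platform training job):

from google.cloud.devtools import cloudbuild_v1

client = cloudbuild_v1.CloudBuildClient()

trigger = cloudbuild_v1.BuildTrigger(
    name="retrain-on-push",
    trigger_template=cloudbuild_v1.RepoSource(
        repo_name="ml-training",  # assumed repository
        branch_name="main",
    ),
    filename="cloudbuild.yaml",   # build config that launches retraining
)

client.create_build_trigger(project_id="my-project", trigger=trigger)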