Pass the Google Machine Learning Engineer Professional-Machine-Learning-Engineer Questions and answers with CertsForce

Viewing page 4 out of 8 pages
Viewing questions 31-40 out of questions
Questions # 31:

You have a functioning end-to-end ML pipeline that involves tuning the hyperparameters of your ML model using AI Platform, and then using the best-tuned parameters for training. Hypertuning is taking longer than expected and is delaying the downstream processes. You want to speed up the tuning job without significantly compromising its effectiveness. Which actions should you take?

Choose 2 answers

Options:

A. Decrease the number of parallel trials.

B. Decrease the range of floating-point values.

C. Set the early stopping parameter to TRUE.

D. Change the search algorithm from Bayesian search to random search.

E. Decrease the maximum number of trials during subsequent training phases.
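
For orientation, the settings named in these options correspond to fields of the hyperparameter tuning spec submitted with an AI Platform Training job. The sketch below only builds such a spec as a plain Python dictionary to show where each setting lives; the metric tag, the parameter name, and every numeric value are illustrative assumptions and are not taken from the question.

```python
import json

# Illustrative spec only: the metric tag, parameter name, and all numbers
# are assumptions chosen for the example.
hyperparameters = {
    "goal": "MAXIMIZE",
    "hyperparameterMetricTag": "accuracy",    # hypothetical metric name
    "maxTrials": 30,                           # total number of trials
    "maxParallelTrials": 5,                    # trials running at the same time
    "enableTrialEarlyStopping": True,          # the early stopping parameter
    "params": [
        {
            "parameterName": "learning_rate",  # hypothetical hyperparameter
            "type": "DOUBLE",
            "minValue": 0.0001,                # lower bound of the float range
            "maxValue": 0.1,                   # upper bound of the float range
            "scaleType": "UNIT_LOG_SCALE",
        }
    ],
}

training_input = {"hyperparameters": hyperparameters}
print(json.dumps(training_input, indent=2))
```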


Questions # 32:

You work for a pharmaceutical company based in Canada. Your team developed a BigQuery ML model to predict the number of flu infections for the next month in Canada. Weather data is published weekly and flu infection statistics are published monthly. You need to configure a model retraining policy that minimizes cost. What should you do?

Options:

A. Download the weather and flu data each week. Configure Cloud Scheduler to execute a Vertex AI pipeline to retrain the model weekly.

B. Download the weather and flu data each month. Configure Cloud Scheduler to execute a Vertex AI pipeline to retrain the model monthly.

C. Download the weather and flu data each week. Configure Cloud Scheduler to execute a Vertex AI pipeline to retrain the model every month.

D. Download the weather data each week, and download the flu data each month. Deploy the model to a Vertex AI endpoint with feature drift monitoring, and retrain the model if a monitoring alert is detected.


Questions # 33:

You work for a retail company that is using a regression model built with BigQuery ML to predict product sales. This model is being used to serve online predictions. Recently you developed a new version of the model that uses a different architecture (custom model). Initial analysis revealed that both models are performing as expected. You want to deploy the new version of the model to production and monitor the performance over the next two months. You need to minimize the impact to the existing and future model users. How should you deploy the model?

Options:

A. Import the new model to the same Vertex AI Model Registry as a different version of the existing model. Deploy the new model to the same Vertex AI endpoint as the existing model, and use traffic splitting to route 95% of production traffic to the BigQuery ML model and 5% of production traffic to the new model.

B. Import the new model to the same Vertex AI Model Registry as the existing model. Deploy the models to one Vertex AI endpoint. Route 95% of production traffic to the BigQuery ML model and 5% of production traffic to the new model.

C. Import the new model to the same Vertex AI Model Registry as the existing model. Deploy each model to a separate Vertex AI endpoint.

D. Deploy the new model to a separate Vertex AI endpoint. Create a Cloud Run service that routes the prediction requests to the corresponding endpoints based on the input feature values.
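
To make the 95% / 5% routing mentioned in the options concrete, here is a small local simulation of percentage-based traffic splitting. It does not call Vertex AI; the model identifiers and the helper functions are hypothetical, and the only property assumed from the real service is that the percentages in a split add up to 100.

```python
import random
from collections import Counter

def validate_traffic_split(split):
    """Check that the percentages are non-negative and sum to 100."""
    if any(pct < 0 for pct in split.values()):
        raise ValueError("traffic percentages must be non-negative")
    if sum(split.values()) != 100:
        raise ValueError("traffic percentages must sum to 100")

def route_request(split, rng):
    """Pick the model that serves one request, weighted by its percentage."""
    model_ids = list(split)
    weights = [split[m] for m in model_ids]
    return rng.choices(model_ids, weights=weights, k=1)[0]

# Hypothetical identifiers for the two deployed models.
traffic_split = {"bqml-model": 95, "custom-model": 5}
validate_traffic_split(traffic_split)

rng = random.Random(0)  # fixed seed so the simulation is repeatable
counts = Counter(route_request(traffic_split, rng) for _ in range(10_000))
print(counts)
```

Over many requests the observed share for each model approaches its configured percentage, which is what lets a small slice of live traffic exercise the new model while most users keep hitting the existing one.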


Questions # 34:

You work for a retailer that sells clothes to customers around the world. You have been tasked with ensuring that ML models are built in a secure manner. Specifically, you need to protect sensitive customer data that might be used in the models. You have identified four fields containing sensitive data that are being used by your data science team: AGE, IS_EXISTING_CUSTOMER, LATITUDE_LONGITUDE, and SHIRT_SIZE. What should you do with the data before it is made available to the data science team for training purposes?

Options:

A. Tokenize all of the fields using hashed dummy values to replace the real values.

B. Use principal component analysis (PCA) to reduce the four sensitive fields to one PCA vector.

C. Coarsen the data by putting AGE into quantiles and rounding LATITUDE_LONGITUDE into single precision. The other two fields are already as coarse as possible.

D. Remove all sensitive data fields, and ask the data science team to build their models using non-sensitive data.
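
As an illustration of what coarsening a field can look like, the sketch below buckets ages into quantiles and reduces coordinate precision using only the Python standard library. The function names, the sample ages, and the choice of four buckets are assumptions; rounding coordinates to one decimal place is just one possible reading of the option's wording.

```python
from bisect import bisect_right
from statistics import quantiles

def age_to_quantile_bucket(ages, n=4):
    """Replace each age with the index (0..n-1) of its quantile bucket."""
    cut_points = quantiles(ages, n=n)  # n-1 boundaries between buckets
    return [bisect_right(cut_points, age) for age in ages]

def coarsen_lat_lon(lat, lon, decimals=1):
    """Round coordinates so they identify an area rather than an address."""
    return (round(lat, decimals), round(lon, decimals))

# Hypothetical sample values.
ages = [18, 25, 31, 40, 52, 67, 73, 80]
buckets = age_to_quantile_bucket(ages)
print(buckets)
print(coarsen_lat_lon(43.6532, -79.3832))
```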


Questions # 35:

You are investigating the root cause of a misclassification error made by one of your models. You used Vertex AI Pipelines to train and deploy the model. The pipeline reads data from BigQuery, creates a copy of the data in Cloud Storage in TFRecord format, trains the model in Vertex AI Training on that copy, and deploys the model to a Vertex AI endpoint. You have identified the specific version of that model that misclassified, and you need to recover the data this model was trained on. How should you find that copy of the data?

Options:

A. Use Vertex AI Feature Store. Modify the pipeline to use the feature store, and ensure that all training data is stored in it. Search the feature store for the data used for the training.

B. Use the lineage feature of Vertex AI Metadata to find the model artifact. Determine the version of the model and identify the step that creates the data copy, and search in the metadata for its location.

C. Use the logging features in the Vertex AI endpoint to determine the timestamp of the model's deployment. Find the pipeline run at that timestamp. Identify the step that creates the data copy, and search in the logs for its location.

D. Find the job ID in Vertex AI Training corresponding to the training for the model. Search in the logs of that job for the data used for the training.


Questions # 36:

You work for a company that provides an anti-spam service that flags and hides spam posts on social media platforms. Your company currently uses a list of 200,000 keywords to identify suspected spam posts. If a post contains more than a few of these keywords, the post is identified as spam. You want to start using machine learning to flag spam posts for human review. What is the main advantage of implementing machine learning for this business case?

Options:

A. Posts can be compared to the keyword list much more quickly.

B. New problematic phrases can be identified in spam posts.

C. A much longer keyword list can be used to flag spam posts.

D. Spam posts can be flagged using far fewer keywords.
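
The keyword rule described in the question can be sketched in a few lines, which makes its behaviour easier to reason about. The keyword set, the threshold of 3 standing in for "more than a few", and the function names are all assumptions made for the example; the real list has 200,000 entries.

```python
import re

def count_keyword_hits(post, keywords):
    """Count how many distinct listed keywords appear in the post."""
    words = set(re.findall(r"[a-z0-9']+", post.lower()))
    return len(words & keywords)

def is_suspected_spam(post, keywords, threshold=3):
    """Flag the post when it contains more than `threshold` keywords."""
    return count_keyword_hits(post, keywords) > threshold

# Tiny hypothetical stand-in for the 200,000-keyword list.
keywords = {"free", "winner", "prize", "click", "cash"}

print(is_suspected_spam("Click now, WINNER! Free cash prize inside", keywords))
print(is_suspected_spam("Free lunch at noon", keywords))
print(is_suspected_spam("fr33 c4sh pr1ze", keywords))
```

Because the rule relies on exact matches, a post that uses spellings or phrases missing from the list produces no hits at all, as the third call shows.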


Questions # 37:

You are the lead ML engineer on a mission-critical project that involves analyzing massive datasets using Apache Spark. You need to establish a robust environment that allows your team to rapidly prototype Spark models using Jupyter notebooks. What is the fastest way to achieve this?

Options:

A. Configure a Compute Engine instance with Spark and use Jupyter notebooks.

B. Set up a Dataproc cluster with Spark and use Jupyter notebooks.

C. Set up a Vertex AI Workbench instance with a Spark kernel.

D. Use Colab Enterprise with a Spark kernel.


Questions # 38:

You work as an ML engineer at a social media company, and you are developing a visual filter for users’ profile photos. This requires you to train an ML model to detect bounding boxes around human faces. You want to use this filter in your company’s iOS-based mobile phone application. You want to minimize code development and want the model to be optimized for inference on mobile phones. What should you do?

Options:

A. Train a model using AutoML Vision and use the “export for Core ML” option.

B. Train a model using AutoML Vision and use the “export for Coral” option.

C. Train a model using AutoML Vision and use the “export for TensorFlow.js” option.

D. Train a custom TensorFlow model and convert it to TensorFlow Lite (TFLite).


Questions # 39:

You work for a bank and are building a random forest model for fraud detection. You have a dataset that includes transactions, of which 1% are identified as fraudulent. Which data transformation strategy would likely improve the performance of your classifier?

Options:

A. Write your data in TFRecords.

B. Z-normalize all the numeric features.

C. Oversample the fraudulent transactions 10 times.

D. Use one-hot encoding on all categorical features.
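
For readers unfamiliar with the term, oversampling a class means repeating its examples so it makes up a larger share of the training data. The sketch below shows the simplest form, duplicating each minority row a fixed number of times; the `is_fraud` column name, the function name, and the synthetic rows are assumptions for the example.

```python
import random

def oversample_minority(rows, label_key="is_fraud", factor=10, seed=0):
    """Return a shuffled copy of rows with each minority row repeated `factor` times."""
    minority = [r for r in rows if r[label_key] == 1]
    majority = [r for r in rows if r[label_key] == 0]
    resampled = majority + minority * factor
    random.Random(seed).shuffle(resampled)  # avoid long runs of one class
    return resampled

# Synthetic data with a 1% positive rate, matching the ratio in the question.
rows = [{"id": i, "is_fraud": 1 if i < 10 else 0} for i in range(1000)]
balanced = oversample_minority(rows)

fraud_share = sum(r["is_fraud"] for r in balanced) / len(balanced)
print(len(balanced), round(fraud_share, 3))
```

Resampling of this kind is normally applied to the training split only, so that evaluation data keeps the original class ratio.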


Questions # 40:

You are training a TensorFlow model on a structured data set with 100 billion records stored in several CSV files. You need to improve the input/output execution performance. What should you do?

Options:

A. Load the data into BigQuery and read the data from BigQuery.

B. Load the data into Cloud Bigtable, and read the data from Bigtable.

C. Convert the CSV files into shards of TFRecords, and store the data in Cloud Storage.

D. Convert the CSV files into shards of TFRecords, and store the data in the Hadoop Distributed File System (HDFS).
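
The idea of sharding mentioned in the options can be shown without TensorFlow: rows are distributed across a fixed number of output files so that several readers can consume them in parallel. This sketch only assigns CSV rows to shards and generates shard file names; the serialization to TFRecord is deliberately omitted, and the `train` prefix, the naming pattern, and the helper functions are assumptions.

```python
import csv
import io

def shard_name(prefix, index, num_shards):
    """Build a file name such as train-00000-of-00002.tfrecord."""
    return f"{prefix}-{index:05d}-of-{num_shards:05d}.tfrecord"

def assign_rows_to_shards(csv_text, num_shards):
    """Distribute CSV rows round-robin across `num_shards` lists."""
    reader = csv.DictReader(io.StringIO(csv_text))
    shards = {i: [] for i in range(num_shards)}
    for row_number, row in enumerate(reader):
        shards[row_number % num_shards].append(row)
    return shards

# Hypothetical five-row input standing in for the real CSV files.
csv_text = "id,amount\n1,10\n2,20\n3,30\n4,40\n5,50\n"
shards = assign_rows_to_shards(csv_text, num_shards=2)

for index, rows_in_shard in shards.items():
    print(shard_name("train", index, 2), [r["id"] for r in rows_in_shard])
```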

