
Pass the Google Cloud Certified Professional Data Engineer questions and answers with CertsForce

Viewing page 6 out of 8 pages
Viewing questions 51-60
Question # 51:

When creating a new Cloud Dataproc cluster with the projects.regions.clusters.create operation, these four values are required: project, region, name, and ____.

Options:

A.

zone


B.

node


C.

label


D.

type


Expert Solution
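The request shape can be sketched as follows. This is a minimal illustration, not a runnable API call: the project and region appear in the request URL of the Dataproc v1 `projects.regions.clusters.create` method, the cluster name goes in the body, and the zone is supplied inside the cluster config (`gceClusterConfig.zoneUri`). The project, cluster, and zone names here are made up.

```python
import json

def build_cluster_request(project_id, region, cluster_name, zone):
    # project and region are path parameters of the create method;
    # the body carries the cluster name and its config, including the zone.
    url = (f"https://dataproc.googleapis.com/v1/projects/{project_id}"
           f"/regions/{region}/clusters")
    body = {
        "projectId": project_id,
        "clusterName": cluster_name,
        "config": {
            "gceClusterConfig": {
                "zoneUri": f"projects/{project_id}/zones/{zone}",
            }
        },
    }
    return url, json.dumps(body)

url, body = build_cluster_request(
    "my-project", "us-central1", "demo-cluster", "us-central1-a")
```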
Question # 52:

You want to archive data in Cloud Storage. Because some data is very sensitive, you want to use the “Trust No One” (TNO) approach to encrypt your data to prevent the cloud provider staff from decrypting your data. What should you do?

Options:

A.

Use gcloud kms keys create to create a symmetric key. Then use gcloud kms encrypt to encrypt each archival file with the key and unique additional authenticated data (AAD). Use gsutil cp to upload each encrypted file to the Cloud Storage bucket, and keep the AAD outside of Google Cloud.


B.

Use gcloud kms keys create to create a symmetric key. Then use gcloud kms encrypt to encrypt each archival file with the key. Use gsutil cp to upload each encrypted file to the Cloud Storage bucket. Manually destroy the key previously used for encryption, and rotate the key once.


C.

Specify customer-supplied encryption key (CSEK) in the .boto configuration file. Use gsutil cp to upload each archival file to the Cloud Storage bucket. Save the CSEK in Cloud Memorystore as permanent storage of the secret.


D.

Specify customer-supplied encryption key (CSEK) in the .boto configuration file. Use gsutil cp to upload each archival file to the Cloud Storage bucket. Save the CSEK in a different project that only the security team can access.


Expert Solution
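For context on the CSEK options: a customer-supplied encryption key is a base64-encoded 256-bit AES key that you generate and hold yourself; Cloud Storage uses it transiently and keeps only a SHA-256 hash of it. A minimal sketch of generating one and the `.boto` entry gsutil reads it from (the `[GSUtil]` section's `encryption_key` setting):

```python
import base64
import hashlib
import os

def make_csek():
    # 256 bits of locally generated key material; Google never stores the key,
    # only a SHA-256 hash used to identify which key encrypted an object.
    raw = os.urandom(32)
    key_b64 = base64.b64encode(raw).decode()
    sha_b64 = base64.b64encode(hashlib.sha256(raw).digest()).decode()
    return key_b64, sha_b64

key_b64, sha_b64 = make_csek()

# gsutil picks the key up from the .boto configuration file:
boto_snippet = f"[GSUtil]\nencryption_key = {key_b64}\n"
```

Note this alone is not "Trust No One": the key still transits to Google with each request, which is why where you store the key matters in the options above.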
Question # 53:

You are troubleshooting your Dataflow pipeline that processes data from Cloud Storage to BigQuery. You have discovered that the Dataflow worker nodes cannot communicate with one another. Your networking team relies on Google Cloud network tags to define firewall rules. You need to identify the issue while following Google-recommended networking security practices. What should you do?

Options:

A.

Determine whether your Dataflow pipeline has a custom network tag set.


B.

Determine whether there is a firewall rule set to allow traffic on TCP ports 12345 and 12346 for the Dataflow network tag.


C.

Determine whether your Dataflow pipeline is deployed with the external IP address option enabled.


D.

Determine whether there is a firewall rule set to allow traffic on TCP ports 12345 and 12346 on the subnet used by Dataflow workers.


Expert Solution
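The check behind option B can be sketched as follows. Dataflow workers exchange data over TCP ports 12345 and 12346, so an ingress rule scoped to the workers' network tag must cover both. This is a hypothetical sketch: the rule dicts below are invented sample data, shaped loosely like the Compute Engine `firewalls.list` output, and the `dataflow` tag is an assumed tag name.

```python
def allows_worker_traffic(rules, tag, ports=("12345", "12346")):
    # Return True if any ingress rule targeting `tag` allows TCP on all `ports`.
    for rule in rules:
        if rule.get("direction") != "INGRESS" or tag not in rule.get("targetTags", []):
            continue
        for allowed in rule.get("allowed", []):
            if allowed.get("IPProtocol") != "tcp":
                continue
            covered = set()
            for p in allowed.get("ports", []):
                if "-" in p:  # expand a port range like "12345-12346"
                    lo, hi = map(int, p.split("-"))
                    covered.update(str(n) for n in range(lo, hi + 1))
                else:
                    covered.add(p)
            if all(p in covered for p in ports):
                return True
    return False

rules = [{"direction": "INGRESS", "targetTags": ["dataflow"],
          "allowed": [{"IPProtocol": "tcp", "ports": ["12345-12346"]}]}]
```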
Question # 54:

You work for a large financial institution that is planning to use Dialogflow to create a chatbot for the company's mobile app. You have reviewed old chat logs and tagged each conversation for intent based on each customer's stated intention for contacting customer service. About 70% of customer requests are simple requests that are solved within 10 intents. The remaining 30% of inquiries require much longer, more complicated requests. Which intents should you automate first?

Options:

A.

Automate the 10 intents that cover 70% of the requests so that live agents can handle more complicated requests


B.

Automate the more complicated requests first because those require more of the agents' time


C.

Automate a blend of the shortest and longest intents to be representative of all intents


D.

Automate intents in places where common words such as "payment" appear only once so the software isn't confused


Expert Solution
Question # 55:

You are building a model to predict whether or not it will rain on a given day. You have thousands of input features and want to see if you can improve training speed by removing some features while having a minimum effect on model accuracy. What can you do?

Options:

A.

Eliminate features that are highly correlated to the output labels.


B.

Combine highly co-dependent features into one representative feature.


C.

Instead of feeding in each feature individually, average their values in batches of 3.


D.

Remove the features that have null values for more than 50% of the training records.


Expert Solution
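The idea in option B can be sketched with a toy example: find pairs of features whose Pearson correlation exceeds a threshold and replace each pair with a single representative feature (here, their element-wise mean). The feature names and values below are invented; a Celsius/Fahrenheit pair is a deliberately perfect example of co-dependence.

```python
import statistics

def pearson(xs, ys):
    # Plain Pearson correlation coefficient for two equal-length sequences.
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def merge_codependent(features, threshold=0.95):
    # Greedily pair up highly correlated features and average each pair
    # into one representative feature; leave the rest untouched.
    names = list(features)
    merged, used = {}, set()
    for i, a in enumerate(names):
        if a in used:
            continue
        for b in names[i + 1:]:
            if b not in used and abs(pearson(features[a], features[b])) >= threshold:
                merged[f"{a}+{b}"] = [(x + y) / 2
                                      for x, y in zip(features[a], features[b])]
                used.update({a, b})
                break
        if a not in used:
            merged[a] = features[a]
    return merged

features = {"temp_c": [10, 12, 14, 16],
            "temp_f": [50, 53.6, 57.2, 60.8],   # perfectly correlated with temp_c
            "humidity": [80, 60, 75, 55]}
out = merge_codependent(features)
```

Fewer, less redundant inputs speed up training while losing little information, since the merged pair carried nearly the same signal twice.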
Question # 56:

You monitor and optimize the BigQuery instance for your team. You notice that a particular daily report that uses a large JOIN operation is consistently slow. You want to examine the query's execution plan to identify potential performance bottlenecks within the JOIN as quickly as possible. What should you do?

Options:

A.

Review the BigQuery audit logs in Cloud Logging.


B.

Run a query on the INFORMATION_SCHEMA.JOBS_BY_PROJECT view filtering by the job_id and analyze total_bytes_processed.


C.

Leverage BigQuery's Query History view and analyze the execution graph.


D.

Use the bq query --dry_run command to review the estimated number of bytes read and review query syntax.


Expert Solution
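The execution graph in the Query History view surfaces per-stage timing, so finding the bottleneck amounts to ranking stages by how long they ran. A toy sketch of that triage, using invented stage records shaped loosely like the BigQuery Jobs API's query-plan stages (`name`, `computeMsAvg`):

```python
def slowest_stages(query_plan, top_n=2):
    # Rank plan stages by average compute time, slowest first.
    return sorted(query_plan, key=lambda s: s["computeMsAvg"], reverse=True)[:top_n]

# Invented sample plan: the JOIN stage dominates.
plan = [
    {"name": "S00: Input", "computeMsAvg": 120},
    {"name": "S01: Join+", "computeMsAvg": 45000},
    {"name": "S02: Output", "computeMsAvg": 300},
]
worst = slowest_stages(plan, top_n=1)[0]
```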
Question # 57:

Your company is streaming real-time sensor data from their factory floor into Bigtable and they have noticed extremely poor performance. How should the row key be redesigned to improve Bigtable performance on queries that populate real-time dashboards?

Options:

A.

Use a row key of the form <timestamp>.


B.

Use a row key of the form <sensorid>.


C.

Use a row key of the form <timestamp>#<sensorid>.


D.

Use a row key of the form <sensorid>#<timestamp>.


Expert Solution
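The row-key trade-off can be sketched concretely. Bigtable stores rows sorted lexicographically by row key: a timestamp-first key sends all concurrent writes to one tablet (hotspotting), while a sensor-id-first key keeps each sensor's readings contiguous, so a dashboard query for one sensor becomes a cheap prefix scan. The sensor ids and timestamps below are made up, and the zero-padding keeps numeric timestamps sorting correctly as strings.

```python
def row_key(sensor_id, epoch_seconds):
    # Sensor id first for distribution and prefix scans; zero-padded
    # timestamp second so lexicographic order matches time order.
    return f"{sensor_id}#{epoch_seconds:012d}"

keys = sorted(row_key(s, t) for s, t in
              [("sensor42", 1700000002), ("sensor07", 1700000001),
               ("sensor42", 1700000001)])

# All of sensor42's readings are adjacent -- one prefix scan serves the dashboard.
same_sensor = [k for k in keys if k.startswith("sensor42#")]
```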
Question # 58:

Your software uses a simple JSON format for all messages. These messages are published to Google Cloud Pub/Sub, then processed with Google Cloud Dataflow to create a real-time dashboard for the CFO. During testing, you notice that some messages are missing in the dashboard. You check the logs, and all messages are being published to Cloud Pub/Sub successfully. What should you do next?

Options:

A.

Check the dashboard application to see if it is not displaying correctly.


B.

Run a fixed dataset through the Cloud Dataflow pipeline and analyze the output.


C.

Use Google Stackdriver Monitoring on Cloud Pub/Sub to find the missing messages.


D.

Switch Cloud Dataflow to pull messages from Cloud Pub/Sub instead of Cloud Pub/Sub pushing messages to Cloud Dataflow.


Expert Solution
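The approach in option B can be illustrated with a toy harness: feed a known, fixed dataset through the same transform the pipeline applies and diff the output count against the input count. The `transform` below is a stand-in for the real Dataflow logic, and the sample messages are invented; the point is that malformed or unexpected records dropped inside the pipeline become visible immediately.

```python
import json

def transform(messages):
    # Stand-in for the pipeline's parse step: extract "value" from each
    # JSON message, silently dropping anything malformed -- exactly the
    # kind of loss a fixed-dataset run is meant to surface.
    out = []
    for raw in messages:
        try:
            out.append(json.loads(raw)["value"])
        except (json.JSONDecodeError, KeyError):
            pass
    return out

fixed_dataset = ['{"value": 1}', '{"value": 2}', 'not json', '{"amount": 3}']
result = transform(fixed_dataset)
missing = len(fixed_dataset) - len(result)
```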
Question # 59:

You are deploying 10,000 new Internet of Things devices to collect temperature data in your warehouses globally. You need to process, store and analyze these very large datasets in real time. What should you do?

Options:

A.

Send the data to Google Cloud Datastore and then export to BigQuery.


B.

Send the data to Google Cloud Pub/Sub, stream Cloud Pub/Sub to Google Cloud Dataflow, and store the data in Google BigQuery.


C.

Send the data to Cloud Storage and then spin up an Apache Hadoop cluster as needed in Google Cloud Dataproc whenever analysis is required.


D.

Export logs in batch to Google Cloud Storage and then spin up a Google Cloud SQL instance, import the data from Cloud Storage, and run an analysis as needed.


Expert Solution
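The shape of the streaming aggregation in option B can be sketched without any cloud services: bucket sensor readings into fixed (tumbling) time windows per warehouse and average them, which is the kind of aggregate Dataflow would compute on the Pub/Sub stream before writing results to BigQuery. The readings below are invented sample data.

```python
from collections import defaultdict

def windowed_averages(readings, window_s=60):
    # Group (warehouse, timestamp, temperature) readings into per-warehouse
    # tumbling windows of window_s seconds, then average each window.
    buckets = defaultdict(list)
    for warehouse, ts, temp in readings:
        window_start = ts // window_s * window_s
        buckets[(warehouse, window_start)].append(temp)
    return {k: sum(v) / len(v) for k, v in buckets.items()}

readings = [("nyc", 0, 20.0), ("nyc", 30, 22.0), ("nyc", 61, 21.0),
            ("ber", 10, 18.0)]
avgs = windowed_averages(readings)
```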
Question # 60:

Your startup has never implemented a formal security policy. Currently, everyone in the company has access to the datasets stored in Google BigQuery. Teams have freedom to use the service as they see fit, and they have not documented their use cases. You have been asked to secure the data warehouse. You need to discover what everyone is doing. What should you do first?

Options:

A.

Use Google Stackdriver Audit Logs to review data access.


B.

Get the identity and access management (IAM) policy of each table.


C.

Use Stackdriver Monitoring to see the usage of BigQuery query slots.


D.

Use the Google Cloud Billing API to see what account the warehouse is being billed to.


Expert Solution
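Discovery via audit logs amounts to summarizing who touched which dataset. A toy sketch of that summary, using invented entries shaped loosely like Cloud Audit Logs data-access records (service name, caller identity, dataset):

```python
from collections import Counter

def access_summary(entries):
    # Count (caller, dataset) pairs among BigQuery data-access entries,
    # ignoring audit records from other services.
    return Counter((e["principalEmail"], e["dataset"])
                   for e in entries
                   if e.get("serviceName") == "bigquery.googleapis.com")

# Invented sample log entries.
entries = [
    {"serviceName": "bigquery.googleapis.com",
     "principalEmail": "ana@example.com", "dataset": "sales"},
    {"serviceName": "bigquery.googleapis.com",
     "principalEmail": "ana@example.com", "dataset": "sales"},
    {"serviceName": "compute.googleapis.com",
     "principalEmail": "bot@example.com", "dataset": "-"},
]
summary = access_summary(entries)
```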