
Pass the Google Cloud Certified Professional Data Engineer (Professional-Data-Engineer) exam with CertsForce questions and answers

Page 1 of 6 (Questions 1-10)
Question # 1:

You work for an economic consulting firm that helps companies identify economic trends as they happen. As part of your analysis, you use Google BigQuery to correlate customer data with the average prices of the 100 most common goods sold, including bread, gasoline, milk, and others. The average prices of these goods are updated every 30 minutes. You want to make sure this data stays up to date so you can combine it with other data in BigQuery as cheaply as possible. What should you do?

Options:

A.

Load the data every 30 minutes into a new partitioned table in BigQuery.


B.

Store and update the data in a regional Google Cloud Storage bucket and create a federated data source in BigQuery


C.

Store the data in Google Cloud Datastore. Use Google Cloud Dataflow to query BigQuery and combine the data programmatically with the data stored in Cloud Datastore


D.

Store the data in a file in a regional Google Cloud Storage bucket. Use Cloud Dataflow to query BigQuery and combine the data programmatically with the data stored in Google Cloud Storage.
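
For context, the federated data source described in option B is an external table that BigQuery reads directly from Cloud Storage at query time, so overwriting the object every 30 minutes keeps query results current without reloading anything. A minimal sketch with the BigQuery Python client, assuming hypothetical project, dataset, bucket, and column names:

```python
from google.cloud import bigquery

client = bigquery.Client(project="example-project")  # assumed project id

# Define an external (federated) table backed by the CSV of average prices.
external_config = bigquery.ExternalConfig("CSV")
external_config.source_uris = ["gs://example-prices-bucket/avg_prices.csv"]
external_config.autodetect = True

table = bigquery.Table("example-project.market_data.avg_prices")
table.external_data_configuration = external_config
client.create_table(table, exists_ok=True)

# Queries read the current contents of the file, so the external table can
# be joined with native BigQuery tables as usual.
rows = client.query(
    "SELECT item, avg_price FROM `example-project.market_data.avg_prices`"
).result()
```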


Question # 2:

You work for a manufacturing plant that batches application log files together into a single log file once a day at 2:00 AM. You have written a Google Cloud Dataflow job to process that log file. You need to make sure the log file is processed once per day as inexpensively as possible. What should you do?

Options:

A.

Change the processing job to use Google Cloud Dataproc instead.


B.

Manually start the Cloud Dataflow job each morning when you get into the office.


C.

Create a cron job with Google App Engine Cron Service to run the Cloud Dataflow job.


D.

Configure the Cloud Dataflow job as a streaming job so that it processes the log data immediately.
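
For context, the App Engine Cron approach in option C usually pairs a cron entry with a small HTTP handler that launches the batch job shortly after the 2:00 AM file arrives; the job then runs exactly once per day and nothing sits idle in between. A rough sketch that launches a hypothetical pre-built Dataflow template through the Dataflow REST API client:

```python
from googleapiclient.discovery import build

def launch_daily_job():
    """Handler invoked by an App Engine Cron entry shortly after 2:00 AM."""
    dataflow = build("dataflow", "v1b3")
    request = dataflow.projects().locations().templates().launch(
        projectId="example-project",  # assumed project id
        location="us-central1",
        gcsPath="gs://example-bucket/templates/process-daily-log",  # assumed template
        body={
            "jobName": "daily-log-processing",
            "parameters": {"inputFile": "gs://example-bucket/logs/daily.log"},
        },
    )
    response = request.execute()
    return response["job"]["id"]
```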


Question # 3:

You are designing the database schema for a machine learning-based food ordering service that will predict what users want to eat. Here is some of the information you need to store:

The user profile: What the user likes and doesn’t like to eat

The user account information: Name, address, preferred meal times

The order information: When orders are made, from where, to whom

The database will be used to store all the transactional data of the product. You want to optimize the data schema. Which Google Cloud Platform product should you use?

Options:

A.

BigQuery


B.

Cloud SQL


C.

Cloud Bigtable


D.

Cloud Datastore


Question # 4:

You are deploying a new storage system for your mobile application, which is a media streaming service. You decide the best fit is Google Cloud Datastore. You have entities with multiple properties, some of which can take on multiple values. For example, in the entity ‘Movie’ the property ‘actors’ and the property ‘tags’ have multiple values but the property ‘date released’ does not. A typical query would ask for all movies with actor=<actorname> ordered by date_released or all movies with tag=Comedy ordered by date_released. How should you avoid a combinatorial explosion in the number of indexes?

Options:

[Answer choices A-D for this question appear only as an image on the source page and are not reproduced here.]


Question # 5:

How can you get a neural network to learn about relationships between categories in a categorical feature?

Options:

A.

Create a multi-hot column


B.

Create a one-hot column


C.

Create a hash bucket


D.

Create an embedding column
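
To make the distinction in the options concrete: a one-hot or multi-hot column treats every category as independent, while an embedding column maps each category to a learned dense vector, so the network can place related categories close together. A small Keras sketch with a hypothetical vocabulary:

```python
import tensorflow as tf

vocab = ["pizza", "sushi", "salad", "burger"]  # hypothetical categories

# Map category strings to integer ids, then to learned 4-dimensional vectors.
lookup = tf.keras.layers.StringLookup(vocabulary=vocab)
embedding = tf.keras.layers.Embedding(
    input_dim=lookup.vocabulary_size(),  # vocabulary plus one out-of-vocabulary slot
    output_dim=4,
)

ids = lookup(tf.constant(["sushi", "salad"]))
vectors = embedding(ids)  # shape (2, 4); the values are trained with the network
```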


Question # 6:

The YARN ResourceManager and the HDFS NameNode interfaces are available on a Cloud Dataproc cluster ____.

Options:

A.

application node


B.

conditional node


C.

master node


D.

worker node


Question # 7:

Cloud Dataproc is a managed Apache Hadoop and Apache _____ service.

Options:

A.

Blaze


B.

Spark


C.

Fire


D.

Ignite


Question # 8:

Which of these is NOT a way to customize the software on Dataproc cluster instances?

Options:

A.

Set initialization actions


B.

Modify configuration files using cluster properties


C.

Configure the cluster using Cloud Deployment Manager


D.

Log into the master node and make changes from there
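
For reference, options A and B correspond directly to fields of the cluster definition itself. A sketch with the Dataproc Python client, using hypothetical project, bucket, and cluster names:

```python
from google.cloud import dataproc_v1

region = "us-central1"
client = dataproc_v1.ClusterControllerClient(
    client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
)

cluster = {
    "project_id": "example-project",   # assumed project id
    "cluster_name": "example-cluster",
    "config": {
        # Initialization actions run a script on each node at creation time.
        "initialization_actions": [
            {"executable_file": "gs://example-bucket/scripts/install-deps.sh"}
        ],
        # Cluster properties modify Hadoop/Spark configuration files.
        "software_config": {"properties": {"spark:spark.executor.memory": "4g"}},
    },
}

operation = client.create_cluster(
    request={"project_id": "example-project", "region": region, "cluster": cluster}
)
operation.result()  # wait for cluster creation to finish
```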


Question # 9:

You are planning to use Google's Dataflow SDK to analyze customer data such as the records displayed below. Your project requires you to extract only the customer name from the data source and then write it to an output PCollection.

Tom,555 X street

Tim,553 Y street

Sam, 111 Z street

Which operation is best suited for the above data processing requirement?

Options:

A.

ParDo


B.

Sink API


C.

Source API


D.

Data extraction
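
Element-wise extraction like this is what ParDo does: it applies a user-defined function to every element of a PCollection. A minimal sketch using the Apache Beam Python SDK (the successor to the original Dataflow SDK), with the sample records from the question:

```python
import apache_beam as beam

class ExtractName(beam.DoFn):
    def process(self, element):
        # "Tom,555 X street" -> "Tom"
        yield element.split(",")[0].strip()

with beam.Pipeline() as pipeline:
    names = (
        pipeline
        | "ReadRecords" >> beam.Create(
            ["Tom,555 X street", "Tim,553 Y street", "Sam, 111 Z street"]
        )
        | "ExtractNames" >> beam.ParDo(ExtractName())
        | "Print" >> beam.Map(print)
    )
```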


Question # 10:

Which is the preferred method to use to avoid hotspotting in time series data in Bigtable?

Options:

A.

Field promotion


B.

Randomization


C.

Salting


D.

Hashing
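
To make the options concrete: field promotion moves a field that would otherwise live in a column (such as a device or metric identifier) into the row key ahead of the timestamp, so sequential writes spread across many key prefixes instead of piling onto one hot range, while the key still supports ordered scans per device. A small sketch with hypothetical names:

```python
def make_row_key(device_id: str, timestamp_iso: str) -> bytes:
    """Field promotion: prefix the timestamp with an identifier field.

    Keys like "sensor-042#2024-05-01T10:30:00Z" spread writes across devices,
    whereas keys that begin with the timestamp alone all land on the same
    Bigtable node while that time range is being written.
    """
    return f"{device_id}#{timestamp_iso}".encode("utf-8")

row_key = make_row_key("sensor-042", "2024-05-01T10:30:00Z")
```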

