Google Professional-Data-Engineer Exam Questions Free Practice Test

Viewing page 1 out of 6 pages

Viewing questions 1-10

Questions # 1:

You work for an economic consulting firm that helps companies identify economic trends as they happen. As part of your analysis, you use Google BigQuery to correlate customer data with the average prices of the 100 most common goods sold, including bread, gasoline, milk, and others. The average prices of these goods are updated every 30 minutes. You want to make sure this data stays up to date so you can combine it with other data in BigQuery as cheaply as possible. What should you do?

Options:

Load the data every 30 minutes into a new partitioned table in BigQuery.

Store and update the data in a regional Google Cloud Storage bucket and create a federated data source in BigQuery.

Store the data in Google Cloud Datastore. Use Google Cloud Dataflow to query BigQuery and combine the data programmatically with the data stored in Cloud Datastore.

Store the data in a file in a regional Google Cloud Storage bucket. Use Cloud Dataflow to query BigQuery and combine the data programmatically with the data stored in Google Cloud Storage.

Questions # 2:

You work for a manufacturing plant that batches application log files together into a single log file once a day at 2:00 AM. You have written a Google Cloud Dataflow job to process that log file. You need to make sure the log file is processed once per day as inexpensively as possible. What should you do?

Options:

Change the processing job to use Google Cloud Dataproc instead.

Manually start the Cloud Dataflow job each morning when you get into the office.

Create a cron job with Google App Engine Cron Service to run the Cloud Dataflow job.

Configure the Cloud Dataflow job as a streaming job so that it processes the log data immediately.

Questions # 3:

You are designing the database schema for a machine learning-based food ordering service that will predict what users want to eat. Here is some of the information you need to store:

The user profile: What the user likes and doesn’t like to eat

The user account information: Name, address, preferred meal times

The order information: When orders are made, from where, to whom

The database will be used to store all the transactional data of the product. You want to optimize the data schema. Which Google Cloud Platform product should you use?

Options:

BigQuery

Cloud SQL

Cloud Bigtable

Cloud Datastore

Questions # 4:

You are deploying a new storage system for your mobile application, which is a media streaming service. You decide the best fit is Google Cloud Datastore. You have entities with multiple properties, some of which can take on multiple values. For example, in the entity ‘Movie’ the property ‘actors’ and the property ‘tags’ have multiple values but the property ‘date released’ does not. A typical query would ask for all movies with actor=<actorname> ordered by date_released or all movies with tag=Comedy ordered by date_released. How should you avoid a combinatorial explosion in the number of indexes?

Options:

Option A

Option B

Option C

Option D

Questions # 5:

How can you get a neural network to learn about relationships between categories in a categorical feature?

Options:

Create a multi-hot column

Create a one-hot column

Create a hash bucket

Create an embedding column

Questions # 6:

The YARN ResourceManager and the HDFS NameNode interfaces are available on a Cloud Dataproc cluster ____.

Options:

application node

conditional node

master node

worker node

Questions # 7:

Cloud Dataproc is a managed Apache Hadoop and Apache _____ service.

Options:

Blaze

Spark

Fire

Ignite

Questions # 8:

Which of these is NOT a way to customize the software on Dataproc cluster instances?

Options:

Set initialization actions

Modify configuration files using cluster properties

Configure the cluster using Cloud Deployment Manager

Log into the master node and make changes from there

Questions # 9:

You are planning to use Google's Dataflow SDK to analyze customer data such as that shown below. Your project requirement is to extract only the customer name from the data source and then write it to an output PCollection.

Tom,555 X street

Tim,553 Y street

Sam, 111 Z street

Which operation is best suited for the above data processing requirement?

Options:

ParDo

Sink API

Source API

Data extraction
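The requirement in this question amounts to applying the same small function to every record independently. The sketch below shows that per-element logic in plain Python, without the Dataflow SDK; the function name `extract_customer_name` and the `records` list are hypothetical and only mirror the sample rows above.

```python
def extract_customer_name(line: str) -> str:
    """Return the text before the first comma, with surrounding spaces removed."""
    # Split only once so that commas inside the address are left untouched.
    name, _, _address = line.partition(",")
    return name.strip()


# Sample rows copied from the question text.
records = [
    "Tom,555 X street",
    "Tim,553 Y street",
    "Sam, 111 Z street",
]

# Each input element produces exactly one output element.
names = [extract_customer_name(line) for line in records]
```

In a real pipeline the same function would be applied by the SDK to each element of the input PCollection rather than by a list comprehension.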

Questions # 10:

Which is the preferred method to use to avoid hotspotting in time series data in Bigtable?

Options:

Field promotion

Randomization

Salting

Hashing
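To make three of the option names above concrete, the sketch below builds one row key per technique for the same time series reading. It is illustrative only: the `#` delimiter, the `device_id` field, the bucket count, and all function names are assumptions, not a layout taken from the question or from a Bigtable client library.

```python
import hashlib
import zlib


def field_promoted_key(device_id: str, timestamp: int) -> str:
    # Field promotion: move a field from the column data into the row key,
    # ahead of the timestamp, so writes spread across devices.
    return f"{device_id}#{timestamp}"


def salted_key(device_id: str, timestamp: int, buckets: int = 4) -> str:
    # Salting: prefix the timestamp-first key with a small computed value
    # so consecutive timestamps land in different key ranges.
    salt = zlib.crc32(device_id.encode("utf-8")) % buckets
    return f"{salt}#{timestamp}#{device_id}"


def hashed_key(device_id: str, timestamp: int) -> str:
    # Hashing: replace the readable key with a digest of its contents.
    raw = f"{timestamp}#{device_id}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


promoted = field_promoted_key("sensor42", 1700000000)
salted = salted_key("sensor42", 1700000000)
hashed = hashed_key("sensor42", 1700000000)
```

Note the trade-off this exposes: the promoted key stays human-readable and supports range scans per device, while the hashed key gives up both.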
