Google Professional-Data-Engineer Exam Questions Free Practice Test

Viewing page 5 out of 7 pages

Viewing questions 41-50 out of questions

Questions # 41:

Which of the following statements about the Wide & Deep Learning model are true? (Select 2 answers.)

Options:

The wide model is used for memorization, while the deep model is used for generalization.

A good use for the wide and deep model is a recommender system.

The wide model is used for generalization, while the deep model is used for memorization.

A good use for the wide and deep model is a small-scale linear regression problem.
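
For readers unfamiliar with the architecture, the following is a minimal sketch of how a Wide & Deep model combines its two parts: a linear ("wide") component over sparse cross-product features and a small feed-forward ("deep") component over dense embeddings, whose logits are summed before a sigmoid. All function names, feature names, and weights here are hypothetical and chosen only for illustration; a real model would learn the weights jointly during training.

```python
import math

def wide_logit(cross_features, weights):
    # Linear part: sum the weights of the sparse cross-product features present.
    return sum(weights.get(name, 0.0) for name in cross_features)

def deep_logit(embedding, hidden_weights, output_weights):
    # Deep part: one ReLU hidden layer over a dense embedding vector.
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, embedding)))
        for row in hidden_weights
    ]
    return sum(w * h for w, h in zip(output_weights, hidden))

def predict(cross_features, embedding, wide_w, hidden_w, out_w, bias=0.0):
    # The two logits are added, then squashed to a probability.
    z = wide_logit(cross_features, wide_w) + deep_logit(embedding, hidden_w, out_w) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Made-up weights and inputs (assumptions, not learned values).
wide_w = {"installed=app_a AND impression=app_b": 0.8}
hidden_w = [[0.5, -0.2], [0.1, 0.4]]
out_w = [0.3, -0.6]

p = predict(["installed=app_a AND impression=app_b"], [1.0, 2.0], wide_w, hidden_w, out_w)
print(p)
```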

Questions # 42:

If a dataset contains rows with individual people and columns for year of birth, country, and income, how many of the columns are continuous and how many are categorical?

Options:

1 continuous and 2 categorical

3 categorical

3 continuous

2 continuous and 1 categorical
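
As a rough illustration of the distinction the question draws, the sketch below labels a column "continuous" when every value is numeric and "categorical" otherwise. The sample rows and the `column_kind` helper are hypothetical, and the rule is only a heuristic: a numeric column such as year of birth can also be bucketed into categories if a model calls for it.

```python
from numbers import Number

# Hypothetical sample rows; the values are invented for illustration.
rows = [
    {"year_of_birth": 1984, "country": "Canada", "income": 52000.0},
    {"year_of_birth": 1991, "country": "Japan", "income": 61500.5},
    {"year_of_birth": 1975, "country": "Brazil", "income": 48250.0},
]

def column_kind(values):
    """Label a column 'continuous' if all values are numeric, else 'categorical'."""
    numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in values
    )
    return "continuous" if numeric else "categorical"

kinds = {name: column_kind([row[name] for row in rows]) for name in rows[0]}
print(kinds)
```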

Questions # 43:

Which of the following is NOT a valid use case to select HDD (hard disk drives) as the storage for Google Cloud Bigtable?

Options:

You expect to store at least 10 TB of data.

You will mostly run batch workloads with scans and writes, rather than frequently executing random reads of a small number of rows.

You need to integrate with Google BigQuery.

You will not use the data to back a user-facing or latency-sensitive application.

Questions # 44:

Your company’s on-premises Apache Hadoop servers are approaching end-of-life, and IT has decided to migrate the cluster to Google Cloud Dataproc. A like-for-like migration of the cluster would require 50 TB of Google Persistent Disk per node. The CIO is concerned about the cost of using that much block storage. You want to minimize the storage cost of the migration. What should you do?

Options:

Put the data into Google Cloud Storage.

Use preemptible virtual machines (VMs) for the Cloud Dataproc cluster.

Tune the Cloud Dataproc cluster so that there is just enough disk for all data.

Migrate some of the cold data into Google Cloud Storage, and keep only the hot data in Persistent Disk.

Questions # 45:

What is the recommended way to switch between SSD and HDD storage for your Google Cloud Bigtable instance?

Options:

create a third instance and sync the data from the two storage types via batch jobs

export the data from the existing instance and import the data into a new instance

run parallel instances where one is HDD and the other is SSD

the selection is final and you must resume using the same storage type

Questions # 46:

Your company is in a highly regulated industry. One of your requirements is to ensure individual users have access only to the minimum amount of information required to do their jobs. You want to enforce this requirement with Google BigQuery. Which three approaches can you take? (Choose three.)

Options:

Disable writes to certain tables.

Restrict access to tables by role.

Ensure that the data is encrypted at all times.

Restrict BigQuery API access to approved users.

Segregate data across multiple tables or databases.

Use Google Stackdriver Audit Logging to determine policy violations.

Questions # 47:

Your company has hired a new data scientist who wants to perform complicated analyses across very large datasets stored in Google Cloud Storage and in a Cassandra cluster on Google Compute Engine. The scientist primarily wants to create labelled data sets for machine learning projects, along with some visualization tasks. She reports that her laptop is not powerful enough to perform her tasks and it is slowing her down. You want to help her perform her tasks. What should you do?

Options:

Run a local version of Jupyter on the laptop.

Grant the user access to Google Cloud Shell.

Host a visualization tool on a VM on Google Compute Engine.

Deploy Google Cloud Datalab to a virtual machine (VM) on Google Compute Engine.

Questions # 48:

Your company uses a proprietary system to send inventory data every 6 hours to a data ingestion service in the cloud. Transmitted data includes a payload of several fields and the timestamp of the transmission. If there are any concerns about a transmission, the system re-transmits the data. How should you deduplicate the data most efficiently?

Options:

Assign global unique identifiers (GUID) to each data entry.

Compute the hash value of each data entry, and compare it with all historical data.

Store each data entry as the primary key in a separate database and apply an index.

Maintain a database table to store the hash value and other metadata for each data entry.
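
To make the hash-based options more concrete, here is a minimal sketch of deduplication by payload hash. It assumes the hash covers only the payload fields and not the transmission timestamp, since a re-transmission carries a new timestamp. It also assumes the payload itself distinguishes one reporting period from the next, because otherwise two legitimate reports with identical inventory would collide. The in-memory `seen` dictionary stands in for a database table, and the names `payload_hash` and `ingest` are hypothetical.

```python
import hashlib
import json

# Stand-in for a database table keyed by payload hash (an assumption).
seen = {}

def payload_hash(payload):
    # Serialize with sorted keys so field order does not change the hash.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def ingest(payload, transmitted_at):
    """Return True if the entry is new, False if it is a duplicate."""
    key = payload_hash(payload)
    if key in seen:
        return False
    # Keep some metadata alongside the hash, e.g. when it first arrived.
    seen[key] = {"first_seen": transmitted_at}
    return True

first = ingest({"sku": "A1", "qty": 5}, "2024-01-01T00:00:00Z")
retry = ingest({"sku": "A1", "qty": 5}, "2024-01-01T00:05:00Z")
print(first, retry)
```

Looking up a fixed-size hash in an indexed table avoids comparing each new entry against the full history of stored payloads.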

Questions # 49:

Your startup has never implemented a formal security policy. Currently, everyone in the company has access to the datasets stored in Google BigQuery. Teams have freedom to use the service as they see fit, and they have not documented their use cases. You have been asked to secure the data warehouse. You need to discover what everyone is doing. What should you do first?

Options:

Use Google Stackdriver Audit Logs to review data access.

Get the identity and access management (IAM) policy of each table.

Use Stackdriver Monitoring to see the usage of BigQuery query slots.

Use the Google Cloud Billing API to see what account the warehouse is being billed to.

Questions # 50:

You want to process payment transactions in a point-of-sale application that will run on Google Cloud Platform. Your user base could grow exponentially, but you do not want to manage infrastructure scaling.

Which Google database service should you use?

Options:

Cloud SQL

BigQuery

Cloud Bigtable

Cloud Datastore
