
Pass the Databricks Certification Databricks-Certified-Data-Engineer-Associate questions and answers with CertsForce

Viewing page 3 out of 5 pages
Viewing questions 21-30 out of questions
Question # 21:

Which of the following SQL keywords can be used to convert a table from a long format to a wide format?

Options:

A.

PIVOT


B.

CONVERT


C.

WHERE


D.

TRANSFORM


E.

SUM


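For reference, a minimal sketch of PIVOT in Databricks SQL, assuming a hypothetical long-format sales table with year, quarter, and amount columns:

```sql
-- Long format: one row per (year, quarter) pair.
-- PIVOT turns the listed quarter values into columns,
-- aggregating amount into each one -- the wide format.
SELECT *
FROM sales
PIVOT (
  SUM(amount) FOR quarter IN ('Q1', 'Q2', 'Q3', 'Q4')
);
```

The aggregate (SUM here) is required inside the PIVOT clause; the result has one column per value listed after IN.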
Question # 22:

What Databricks feature can be used to check the data sources and tables used in a workspace?

Options:

A.

Do not use the lineage feature as it only tracks activity from the last 3 months and will not provide full details on dependencies.


B.

Use the lineage feature to visualize a graph that highlights where the table is used only in notebooks.


C.

Use the lineage feature to visualize a graph that highlights where the table is used only in reports.


D.

Use the lineage feature to visualize a graph that shows all dependencies, including where the table is used in notebooks, other tables, and reports.


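Besides the lineage graph in Catalog Explorer, Unity Catalog also records lineage in system tables. A sketch, assuming Unity Catalog and system tables are enabled (the three-level table name is a placeholder):

```sql
-- Each row describes an upstream/downstream relationship captured for
-- the table -- notebooks, jobs, and other tables, not just one kind.
SELECT source_table_full_name, target_table_full_name, entity_type
FROM system.access.table_lineage
WHERE source_table_full_name = 'main.default.orders'
   OR target_table_full_name = 'main.default.orders';
```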
Question # 23:

A data engineer runs a statement every day to copy the previous day’s sales into the table transactions. Each day’s sales are in their own file in the location "/transactions/raw".

Today, the data engineer runs the following command to complete this task:

[Image: the command run by the data engineer]

After running the command today, the data engineer notices that the number of records in table transactions has not changed.

Which of the following describes why the statement might not have copied any new records into the table?

Options:

A.

The format of the files to be copied was not included with the FORMAT_OPTIONS keyword.


B.

The names of the files to be copied were not included with the FILES keyword.


C.

The previous day’s file has already been copied into the table.


D.

The PARQUET file format does not support COPY INTO.


E.

The COPY INTO statement requires the table to be refreshed to view the copied rows.


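For context, a sketch of the kind of statement the question describes, assuming Parquet files in the source location:

```sql
-- COPY INTO is idempotent: files already loaded into the table are
-- skipped on re-runs, so copying yesterday's file twice adds no rows.
COPY INTO transactions
FROM '/transactions/raw'
FILEFORMAT = PARQUET;
```

Adding COPY_OPTIONS ('force' = 'true') would re-ingest previously loaded files, at the risk of duplicate rows.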
Question # 24:

A data engineer streams customer orders into a Kafka topic (orders_topic) and is currently writing the ingestion script of a DLT pipeline. The data engineer needs to ingest the data from the Kafka brokers into DLT using Databricks.

What is the correct code for ingesting the data?

A) [Image: code option A]

B) [Image: code option B]

C) [Image: code option C]

D) [Image: code option D]

Options:

A.

Option A


B.

Option B


C.

Option C


D.

Option D


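For context, one way to declare a streaming dataset over Kafka in DLT SQL is the read_kafka table-valued function; a sketch with placeholder broker values (in Python, the equivalent pattern uses spark.readStream.format("kafka")):

```sql
CREATE OR REFRESH STREAMING TABLE orders_raw AS
SELECT *
FROM STREAM read_kafka(
  bootstrapServers => 'broker1:9092,broker2:9092',  -- placeholder brokers
  subscribe => 'orders_topic'
);
```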
Question # 25:

The Delta transaction log for the 'students' table is shown using the 'DESCRIBE HISTORY students' command. A Data Engineer needs to query the table as it existed before the UPDATE operation listed in the log.

[Image: output of DESCRIBE HISTORY students]

Which command should the Data Engineer use to achieve this? (Choose two.)

Options:

A.

SELECT * FROM students@v4


B.

SELECT * FROM students TIMESTAMP AS OF '2024-04-22T14:32:47.000+00:00'


C.

SELECT * FROM students FROM HISTORY VERSION AS OF 3


D.

SELECT * FROM students VERSION AS OF 5


E.

SELECT * FROM students TIMESTAMP AS OF '2024-04-22T14:32:58.000+00:00'


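For reference, Delta time travel accepts either a version number or a timestamp taken from DESCRIBE HISTORY; a sketch (the version and timestamp values here are illustrative):

```sql
-- By version number from the history output
SELECT * FROM students VERSION AS OF 3;

-- Shorthand for the same query
SELECT * FROM students@v3;

-- By timestamp at or before the commit of interest
SELECT * FROM students TIMESTAMP AS OF '2024-04-22T14:32:47.000+00:00';
```

To see the table as it existed before an operation, use the version (or timestamp) of the commit immediately preceding that operation in the history.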
Question # 26:

In which of the following scenarios should a data engineer use the MERGE INTO command instead of the INSERT INTO command?

Options:

A.

When the location of the data needs to be changed


B.

When the target table is an external table


C.

When the source table can be deleted


D.

When the target table cannot contain duplicate records


E.

When the source is not a Delta table


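For reference, a sketch of the upsert pattern that makes MERGE INTO the right choice when the target must stay free of duplicates (table and column names are hypothetical):

```sql
-- Rows whose key already exists in the target are updated in place;
-- only genuinely new keys are inserted, so no duplicates accumulate.
MERGE INTO customers AS t
USING customer_updates AS s
  ON t.customer_id = s.customer_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
```

A plain INSERT INTO would append every source row, duplicating any keys that already exist in the target table.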
Question # 27:

Which SQL code snippet will correctly demonstrate a Data Definition Language (DDL) operation used to create a table?

Options:

A.

DROP TABLE employees;


B.

INSERT INTO employees (id, name) VALUES (1, 'Alice');


C.

CREATE TABLE employees ( id INT, name STRING );


D.

ALTER TABLE employees ADD COLUMN salary DECIMAL(10,2);


Question # 28:

A data engineer has joined an existing project and they see the following query in the project repository:

CREATE STREAMING LIVE TABLE loyal_customers AS

SELECT customer_id

FROM STREAM(LIVE.customers)

WHERE loyalty_level = 'high';

Which of the following describes why the STREAM function is included in the query?

Options:

A.

The STREAM function is not needed and will cause an error.


B.

The table being created is a live table.


C.

The customers table is a streaming live table.


D.

The customers table is a reference to a Structured Streaming query on a PySpark DataFrame.


E.

The data in the customers table has been updated since its last run.


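For context, STREAM() tells DLT to read another dataset in the same pipeline incrementally, which only works when that dataset is itself a streaming table. A sketch of how the upstream table might be declared (the source path and format are placeholders):

```sql
-- customers must be a streaming live table for
-- STREAM(LIVE.customers) to consume it incrementally
CREATE OR REFRESH STREAMING LIVE TABLE customers AS
SELECT * FROM cloud_files('/raw/customers', 'json');
```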
Question # 29:

A data engineer has three tables in a Delta Live Tables (DLT) pipeline. They have configured the pipeline to drop invalid records at each table. They notice that some data is being dropped due to quality concerns at some point in the DLT pipeline. They would like to determine at which table in their pipeline the data is being dropped.

Which of the following approaches can the data engineer take to identify the table that is dropping the records?

Options:

A.

They can set up separate expectations for each table when developing their DLT pipeline.


B.

They cannot determine which table is dropping the records.


C.

They can set up DLT to notify them via email when records are dropped.


D.

They can navigate to the DLT pipeline page, click on each table, and view the data quality statistics.


E.

They can navigate to the DLT pipeline page, click on the “Error” button, and review the present errors.


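For reference, a sketch of a per-table expectation whose drop counts surface in the pipeline UI's per-table data quality statistics (table and column names are hypothetical):

```sql
CREATE OR REFRESH LIVE TABLE orders_clean (
  -- Rows failing the predicate are dropped, and the drop count is
  -- reported for this table on the DLT pipeline page
  CONSTRAINT valid_order_id EXPECT (order_id IS NOT NULL) ON VIOLATION DROP ROW
) AS
SELECT * FROM LIVE.orders_bronze;
```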
Question # 30:

Which of the following describes the relationship between Bronze tables and raw data?

Options:

A.

Bronze tables contain less data than raw data files.


B.

Bronze tables contain more truthful data than raw data.


C.

Bronze tables contain aggregates while raw data is unaggregated.


D.

Bronze tables contain a less refined view of data than raw data.


E.

Bronze tables contain raw data with a schema applied.

