Databricks-Certified-Data-Engineer-Associate Exam Free Practice Questions (111 questions): "Databricks Certified Data Engineer Associate" Certification

Question: 1

A data engineer has been given a new record of data:
id STRING = 'a1'
rank INTEGER = 6
rating FLOAT = 9.4
Which of the following SQL commands can be used to append the new record to an existing Delta table my_table?

A. UPDATE my_table VALUES ('a1', 6, 9.4)

B. my_table UNION VALUES ('a1', 6, 9.4)

C. UPDATE VALUES ('a1', 6, 9.4) my_table

D. INSERT INTO my_table VALUES ('a1', 6, 9.4)

E. INSERT VALUES ('a1', 6, 9.4) INTO my_table

Correct answer: D
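
As a rough illustration of why option D appends the record, the sketch below runs the same `INSERT INTO ... VALUES` statement against an in-memory SQLite table. SQLite is only a stand-in here (an assumption made so the example runs without a Databricks workspace), and the pre-existing row `'a0'` is hypothetical.

```python
import sqlite3

# sqlite3 stands in for a Delta table (assumption: lets the sketch run anywhere).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE my_table (id TEXT, "rank" INTEGER, rating REAL)')

# Hypothetical row that already exists in the table.
conn.execute("INSERT INTO my_table VALUES ('a0', 3, 7.1)")

# Option D: append the new record; existing rows are left untouched.
conn.execute("INSERT INTO my_table VALUES ('a1', 6, 9.4)")

rows = conn.execute("SELECT * FROM my_table ORDER BY id").fetchall()
print(rows)
```

The table ends up with both rows, which is the append behavior the question asks for. `UPDATE` modifies existing rows rather than adding new ones, and the other options are not valid statement forms.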

Question: 2

A Delta Live Tables pipeline includes two datasets defined using STREAMING LIVE TABLE. Three datasets are defined against Delta Lake table sources using LIVE TABLE.
The pipeline is configured to run in Production mode using the Continuous Pipeline Mode.
Assuming previously unprocessed data exists and all definitions are valid, what is the expected outcome after clicking Start to update the pipeline?

A. All datasets will be updated once and the pipeline will shut down. The compute resources will be terminated.

B. All datasets will be updated at set intervals until the pipeline is shut down. The compute resources will be deployed for the update and terminated when the pipeline is stopped.

C. All datasets will be updated at set intervals until the pipeline is shut down. The compute resources will persist to allow for additional testing.

D. All datasets will be updated once and the pipeline will persist without any processing. The compute resources will persist but go unused.

E. All datasets will be updated once and the pipeline will shut down. The compute resources will persist to allow for additional testing.

Correct answer: B

Question: 3

A data engineer needs to apply custom logic to identify employees with more than 5 years of experience in the array column employees of the table stores. The custom logic should create a new column exp_employees that is an array of all of the employees with more than 5 years of experience for each row. In order to apply this custom logic at scale, the data engineer wants to use the FILTER higher-order function.
Which of the following code blocks successfully completes this task?

A. Option E

B. Option D

C. Option C

D. Option A

E. Option B

Correct answer: D
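
The code for the options is not reproduced on this page. As a hedged illustration of what the FILTER higher-order function does, the sketch below shows one plausible Spark SQL form of the task as a string (not executed) next to a pure-Python analogue. The field name `years_exp`, the column `store_id`, the helper `filter_array`, and the sample rows are all assumptions made for the example.

```python
# One plausible Spark SQL form of the task (shown for reference, not executed here).
# `store_id` and `years_exp` are assumed names; the question does not give them.
SPARK_SQL = """
SELECT store_id,
       employees,
       FILTER(employees, i -> i.years_exp > 5) AS exp_employees
FROM stores
"""


def filter_array(items, predicate):
    """Pure-Python analogue of FILTER(array, lambda): keep matching elements."""
    return [item for item in items if predicate(item)]


# Hypothetical sample rows of the `stores` table.
stores = [
    {"store_id": 1, "employees": [{"name": "Ana", "years_exp": 7},
                                  {"name": "Bo", "years_exp": 2}]},
    {"store_id": 2, "employees": [{"name": "Cy", "years_exp": 5}]},
]

# Build the new array column row by row, as FILTER would for each row.
for row in stores:
    row["exp_employees"] = filter_array(
        row["employees"], lambda e: e["years_exp"] > 5
    )

print([row["exp_employees"] for row in stores])
```

The lambda is applied to each array element within a row, so the result is a new array per row rather than a filtered set of rows. An employee with exactly 5 years is excluded because the condition is strictly greater than 5.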

Question: 4

Which of the following approaches should be used to send the Databricks Job owner an email in the case that the Job fails?

A. There is no way to notify the Job owner in the case of Job failure

B. Manually programming in an alert system in each cell of the Notebook

C. Setting up an Alert in the Job page

D. MLflow Model Registry Webhooks

E. Setting up an Alert in the Notebook

Correct answer: C

Question: 5

Which of the following can be used to simplify and unify siloed data architectures that are specialized for specific use cases?

A. None of these

B. Data warehouse

C. Data lake

D. All of these

E. Data lakehouse

Correct answer: E

Question: 6

A data engineer is attempting to drop a Spark SQL table my_table and runs the following command:
DROP TABLE IF EXISTS my_table;
After running this command, the engineer notices that the data files and metadata files have been deleted from the file system.
Which of the following describes why all of these files were deleted?

A. The table's data was larger than 10 GB

B. The table was external

C. The table was managed

D. The table did not have a location

E. The table's data was smaller than 10 GB

Correct answer: C

Question: 7

In order for Structured Streaming to reliably track the exact progress of the processing so that it can handle any kind of failure by restarting and/or reprocessing, which two of the following approaches are used by Spark to record the offset range of the data being processed in each trigger?

A. Checkpointing and Idempotent Sinks

B. Write-ahead Logs and Idempotent Sinks

C. Checkpointing and Write-ahead Logs

D. Structured Streaming cannot record the offset range of the data being processed in each trigger.

E. Replayable Sources and Idempotent Sinks

Correct answer: C
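
To make the idea behind option C concrete, here is a toy sketch of a write-ahead offset log kept in a checkpoint directory. It is not Spark's implementation: the class `OffsetLog`, its methods, and the `offsets`/`commits` directory layout are assumptions chosen for illustration. The point it shows is that the offset range is recorded before a batch runs, so a restart can find and reprocess any batch that never finished.

```python
import json
import os
import tempfile


class OffsetLog:
    """Toy write-ahead log (hypothetical, not Spark's code)."""

    def __init__(self, checkpoint_dir):
        self.offsets_dir = os.path.join(checkpoint_dir, "offsets")
        self.commits_dir = os.path.join(checkpoint_dir, "commits")
        os.makedirs(self.offsets_dir, exist_ok=True)
        os.makedirs(self.commits_dir, exist_ok=True)

    def plan(self, batch_id, start, end):
        # Written BEFORE the batch runs, so a crash cannot lose the range.
        with open(os.path.join(self.offsets_dir, str(batch_id)), "w") as f:
            json.dump({"start": start, "end": end}, f)

    def commit(self, batch_id):
        # Written AFTER the batch's output has been delivered.
        open(os.path.join(self.commits_dir, str(batch_id)), "w").close()

    def pending(self):
        """Return (batch_id, range) pairs that were planned but never committed."""
        committed = set(os.listdir(self.commits_dir))
        result = []
        for name in sorted(os.listdir(self.offsets_dir), key=int):
            if name not in committed:
                with open(os.path.join(self.offsets_dir, name)) as f:
                    result.append((int(name), json.load(f)))
        return result


checkpoint = tempfile.mkdtemp()
log = OffsetLog(checkpoint)
log.plan(0, 0, 100)
log.commit(0)
log.plan(1, 100, 250)  # simulated crash: batch 1 is never committed

# A restarted process reads the same checkpoint directory.
recovered = OffsetLog(checkpoint).pending()
print(recovered)
```

After the simulated crash, the restarted log reports batch 1 with its exact offset range, so the same data can be reprocessed. Replayable sources and idempotent sinks, mentioned in the other options, are what make that reprocessing safe end to end, but they are not what records the offsets.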

Question: 8

Which of the following benefits is provided by the array functions from Spark SQL?

A. An ability to work with an array of tables for procedural automation

B. An ability to work with data in a variety of types at once

C. An ability to work with complex, nested data ingested from JSON files

D. An ability to work with data within certain partitions and windows

E. An ability to work with time-related data in specified intervals

Correct answer: C

Question: 9

Which two components function in the control plane of the Databricks platform architecture? (Choose two.)

A. Unity Catalog

B. Serverless Compute

C. Compute

D. Virtual Machines

E. Compute Orchestration

Correct answer: A, E

Question: 10

A data engineer needs to determine whether to use the built-in Databricks Notebooks versioning or version their project using Databricks Repos.
Which of the following is an advantage of using Databricks Repos over the Databricks Notebooks versioning?

A. Databricks Repos allows users to revert to previous versions of a notebook

B. Databricks Repos automatically saves development progress

C. Databricks Repos provides the ability to comment on specific changes

D. Databricks Repos is wholly housed within the Databricks Lakehouse Platform

E. Databricks Repos supports the use of multiple branches

Correct answer: E
