Databricks-Certified-Data-Engineer-Associate Exam Free Question Set: "Databricks Certified Data Engineer Associate Certification"

A data engineer has been given a new record of data:
id STRING = 'a1'
rank INTEGER = 6
rating FLOAT = 9.4
Which of the following SQL commands can be used to append the new record to an existing Delta table my_table?
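
For reference, a minimal sketch of an append in Spark SQL, assuming my_table already exists with the columns id, rank, and rating in that order (the column order is an assumption, since the question gives only the record's fields). INSERT INTO adds rows without touching existing data:

```sql
-- Append one row to the existing Delta table.
-- Values are listed positionally, so they are assumed to match
-- the table's column order: id STRING, rank INTEGER, rating FLOAT.
INSERT INTO my_table VALUES ('a1', 6, 9.4);
```

By contrast, INSERT OVERWRITE would replace the table's existing contents rather than append to them.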

A Delta Live Tables pipeline includes two datasets defined using STREAMING LIVE TABLE. Three datasets are defined against Delta Lake table sources using LIVE TABLE.
The pipeline is configured to run in Production mode using the Continuous Pipeline Mode.
Assuming previously unprocessed data exists and all definitions are valid, what is the expected outcome after clicking Start to update the pipeline?

A data engineer needs to apply custom logic to identify employees with more than 5 years of experience in the array column employees of the table stores. The custom logic should create a new column exp_employees that is an array of all of the employees with more than 5 years of experience for each row. To apply this custom logic at scale, the data engineer wants to use the FILTER higher-order function.
Which of the following code blocks successfully completes this task?
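
As an illustration of the general shape of FILTER in Spark SQL, here is a hedged sketch. It assumes each element of employees is a struct with a numeric field named years_exp; that field name is hypothetical, since the question does not give the element schema. FILTER takes an array and a lambda, and keeps the elements for which the lambda returns true:

```sql
-- For each row of stores, build exp_employees from the elements of
-- the employees array whose (assumed) years_exp field exceeds 5.
SELECT
  *,
  FILTER(employees, e -> e.years_exp > 5) AS exp_employees
FROM stores;
```

The lambda runs once per array element within each row, so the array does not need to be exploded and re-aggregated.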

Which of the following approaches should be used to send the Databricks Job owner an email in the case that the Job fails?

Which of the following can be used to simplify and unify siloed data architectures that are specialized for specific use cases?

A data engineer is attempting to drop a Spark SQL table my_table and runs the following command:
DROP TABLE IF EXISTS my_table;
After running this command, the engineer notices that the data files and metadata files have been deleted from the file system.
Which of the following describes why all of these files were deleted?
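
As background, the sketch below contrasts a managed table with an external one. The table names and the path are hypothetical and are not taken from the question. Dropping a managed table removes both its metadata and its data files, whereas dropping an external table removes only the metadata and leaves the files at the external location:

```sql
-- Managed table: no LOCATION clause, so the metastore manages the data files.
CREATE TABLE managed_example (id STRING);

-- External table: the data lives at a user-supplied (hypothetical) path.
CREATE TABLE external_example (id STRING)
LOCATION '/tmp/hypothetical/external_example';

-- The Type row of the output shows MANAGED or EXTERNAL.
DESCRIBE EXTENDED managed_example;
```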

In order for Structured Streaming to reliably track the exact progress of the processing so that it can handle any kind of failure by restarting and/or reprocessing, which of the following two approaches is used by Spark to record the offset range of the data being processed in each trigger?

Which of the following benefits is provided by the array functions from Spark SQL?

Which two components function in the Databricks platform architecture's control plane? (Choose two.)

A data engineer needs to determine whether to use the built-in Databricks Notebooks versioning or version their project using Databricks Repos.
Which of the following is an advantage of using Databricks Repos over the Databricks Notebooks versioning?
