Pass DP-203 (Japanese Version) with This PDF Question Set: Latest Real DP-203 Exam Questions [Q72-Q97]

Try the valid DP-203 (Japanese version) test answers and exam PDF questions

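Question 72
You need to implement an Azure Databricks cluster that automatically connects to Azure Data Lake Storage Gen2 by using Azure Active Directory (Azure AD) integration.
How should you configure the new cluster? To answer, select the appropriate options in the answer area.
Note: Each correct selection is worth one point.

Correct answer:

Explanation: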

References:
https://docs.azuredatabricks.net/spark/latest/data-sources/azure/adls-passthrough.html

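Question 73
You need to integrate the on-premises data sources and Azure Synapse Analytics. The solution must meet the data integration requirements.
Which type of integration runtime should you use?

A. Self-hosted integration runtime
B. Azure-SSIS integration runtime
C. Azure integration runtime

Correct answer: C

Explanation: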
Topic 1, Contoso
Transactional Data
Contoso has three years of customer, transactional, operation, sourcing, and supplier data comprised of 10 billion records stored across multiple on-premises Microsoft SQL Server servers. The SQL Server instances contain data from various operational systems. The data is loaded into the instances by using SQL Server Integration Services (SSIS) packages.
You estimate that combining all product sales transactions into a company-wide sales transactions dataset will result in a single table that contains 5 billion rows, with one row per transaction.
Most queries targeting the sales transactions data will be used to identify which products were sold in retail stores and which products were sold online during different time periods. Sales transaction data that is older than three years will be removed monthly.
You plan to create a retail store table that will contain the address of each retail store. The table will be approximately 2 MB. Queries for retail store sales will include the retail store addresses.
You plan to create a promotional table that will contain a promotion ID. The promotion ID will be associated to a specific product. The product will be identified by a product ID. The table will be approximately 5 GB.
Streaming Twitter Data
The ecommerce department at Contoso develops an Azure logic app that captures trending Twitter feeds referencing the company's products and pushes the products to Azure Event Hubs.
Planned Changes
Contoso plans to implement the following changes:
* Load the sales transaction dataset to Azure Synapse Analytics.
* Integrate on-premises data stores with Azure Synapse Analytics by using SSIS packages.
* Use Azure Synapse Analytics to analyze Twitter feeds to assess customer sentiments about products.
Sales Transaction Dataset Requirements
Contoso identifies the following requirements for the sales transaction dataset:
* Partition data that contains sales transaction records. Partitions must be designed to provide efficient loads by month. Boundary values must belong to the partition on the right.
* Ensure that queries joining and filtering sales transaction records based on product ID complete as quickly as possible.
* Implement a surrogate key to account for changes to the retail store addresses.
* Ensure that data storage costs and performance are predictable.
* Minimize how long it takes to remove old records.
Customer Sentiment Analytics Requirement
Contoso identifies the following requirements for customer sentiment analytics:
* Allow Contoso users to use PolyBase in an Azure Synapse Analytics dedicated SQL pool to query the content of the data records that host the Twitter feeds. Data must be protected by using row-level security (RLS). The users must be authenticated by using their own Azure AD credentials.
* Maximize the throughput of ingesting Twitter feeds from Event Hubs to Azure Storage without purchasing additional throughput or capacity units.
* Store Twitter feeds in Azure Storage by using Event Hubs Capture. The feeds will be converted into Parquet files.
* Ensure that the data store supports Azure AD-based access control down to the object level.
* Minimize administrative effort to maintain the Twitter feed data records.
* Purge Twitter feed data records that are older than two years.
Data Integration Requirements
Contoso identifies the following requirements for data integration:
* Use an Azure service that leverages the existing SSIS packages to ingest on-premises data into datasets stored in a dedicated SQL pool of Azure Synapse Analytics and transform the data.
* Identify a process to ensure that changes to the ingestion and transformation activities can be version controlled and developed independently by multiple data engineers.

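Question 74
You are building an Azure Data Factory pipeline to move data from an Azure Data Lake Storage Gen2 container to a database in an Azure Synapse Analytics dedicated SQL pool.
Data in the container is stored in the following folder structure.
/in/{YYYY}/{MM}/{DD}/{HH}/{mm}
The earliest folder is /in/2021/01/01/00/00. The latest folder is /in/2021/01/15/01/45.
You need to configure a pipeline trigger to meet the following requirements:
* Existing data must be loaded.
* Data must be loaded every 30 minutes.
* Late-arriving data of up to two minutes must be included in the load for the time at which the data should have arrived.
How should you configure the pipeline trigger? To answer, select the appropriate options in the answer area.
Note: Each correct selection is worth one point.

Correct answer:

Explanation: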

Reference:
https://docs.microsoft.com/en-us/azure/data-factory/how-to-create-tumbling-window-trigger
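The three requirements in this question map naturally onto a tumbling window trigger: a start time in the past covers the existing data, a 30-minute interval sets the cadence, and a two-minute delay leaves room for late-arriving data. The following Python sketch is illustrative only. The trigger name is hypothetical, the dictionary simply mirrors the JSON shape of a tumbling window trigger, and the helper functions are not part of any Azure SDK.

```python
from datetime import datetime, timedelta

# Hypothetical trigger definition, written as a Python dict that mirrors the
# JSON shape of a tumbling window trigger. The name is illustrative only.
trigger = {
    "name": "LoadEvery30Minutes",
    "properties": {
        "type": "TumblingWindowTrigger",
        "typeProperties": {
            "frequency": "Minute",
            "interval": 30,
            "startTime": "2021-01-01T00:00:00Z",  # earliest folder
            "delay": "00:02:00",                  # wait for late data
            "maxConcurrency": 1,
        },
    },
}


def parse_delay(text):
    """Convert an hh:mm:ss string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def backfill_windows(start, newest, interval_minutes):
    """List every complete window between start and newest."""
    step = timedelta(minutes=interval_minutes)
    windows = []
    current = start
    while current + step <= newest:
        windows.append((current, current + step))
        current += step
    return windows


props = trigger["properties"]["typeProperties"]
start = datetime.strptime(props["startTime"], "%Y-%m-%dT%H:%M:%SZ")
newest_folder = datetime(2021, 1, 15, 1, 45)

windows = backfill_windows(start, newest_folder, props["interval"])
# A window is processed once its end time plus the delay has passed.
first_run_after = windows[0][1] + parse_delay(props["delay"])
print(len(windows), first_run_after)
```

Because the start time lies in the past, the windows between the earliest and latest folders are backfilled rather than skipped, which is what the "existing data must be loaded" requirement asks for.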

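Question 75
You are creating an Azure Data Factory data flow that will ingest data from a CSV file, cast columns to specified types of data, and insert the data into a table in an Azure Synapse Analytics dedicated SQL pool. The CSV file contains three columns named username, comment, and date.
The data flow already contains the following:
* A source transformation.
* A Derived Column transformation to set the appropriate types of data.
* A sink transformation to land the data in the pool.
You need to ensure that the data flow meets the following requirements:
* All valid rows must be written to the destination table.
* Truncation errors in the comment column must be avoided proactively.
* Any rows containing comment values that will cause truncation errors upon insert must be written to a file in blob storage.
Which two actions should you perform? Each correct answer presents part of the solution.
Note: Each correct selection is worth one point.

A. To the data flow, add a Conditional Split transformation to separate the rows that will cause truncation errors.
B. Add a select transformation to select only the rows that will cause truncation errors.
C. To the data flow, add a sink transformation to write the rows to a file in blob storage.
D. To the data flow, add a filter transformation to filter out rows that will cause truncation errors.

Correct answer: A,C

Explanation: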
A: Example:
1. This conditional split transformation defines the maximum length of "title" to be five. Any row whose title is five characters or fewer goes into the GoodRows stream. Any row whose title is longer than five characters goes into the BadRows stream.

C:
2. Now we need to log the rows that failed. Add a sink transformation to the BadRows stream for logging.
Here, we "auto-map" all of the fields so that the complete transaction record is logged. This is a text-delimited CSV file output to a single file in Blob Storage. We'll call the log file "badrows.csv".

3. With the completed data flow, we can split off error rows to avoid the SQL truncation errors and put those entries into a log file. Meanwhile, successful rows continue to be written to the target database.

Reference:
https://docs.microsoft.com/en-us/azure/data-factory/how-to-data-flow-error-rows
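The split-then-log pattern described above can be sketched outside Data Factory in a few lines of Python. This is only an analogy for the Conditional Split and sink transformations, not Data Factory code. The column limit of five characters follows the example in the explanation, and the function names and sample rows are hypothetical.

```python
import csv
import io

MAX_COMMENT_LENGTH = 5  # assumed column width, as in the five-character example


def conditional_split(rows, max_len=MAX_COMMENT_LENGTH):
    """Route each row to the good or bad stream based on comment length."""
    good, bad = [], []
    for row in rows:
        if len(row["comment"]) <= max_len:
            good.append(row)
        else:
            bad.append(row)
    return good, bad


def write_bad_rows(bad_rows):
    """Serialize rejected rows as CSV text (stand-in for the blob sink)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["username", "comment", "date"])
    writer.writeheader()
    writer.writerows(bad_rows)
    return buffer.getvalue()


rows = [
    {"username": "ann", "comment": "ok", "date": "2021-01-01"},
    {"username": "bob", "comment": "far too long", "date": "2021-01-02"},
]

good_rows, bad_rows = conditional_split(rows)
bad_rows_csv = write_bad_rows(bad_rows)
print(len(good_rows), len(bad_rows))
```

Rows in `good_rows` would continue to the SQL pool sink, while `bad_rows_csv` corresponds to the contents of the log file in blob storage.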

Question 76
You have an Azure Synapse Analytics serverless SQL pool named Pool1 and an Azure Data Lake Storage Gen2 account named storage1. The AllowBlobPublicAccess property is disabled for storage1.
You need to create an external data source that can be used by Azure Active Directory (Azure AD) users to access storage1 from Pool1.
What should you create first?

A. an external resource pool
B. an external library
C. database scoped credentials
D. a remote service binding

Correct answer: C

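Question 77
You use Azure Stream Analytics to receive Twitter data from Azure Event Hubs and to output the data to an Azure Blob storage account.
You need to output the count of tweets during the last five minutes every minute.
Which windowing function should you use?

A. Session
B. Hopping
C. Tumbling
D. Sliding

Correct answer: B

Explanation: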
Hopping window functions hop forward in time by a fixed period. It may be easy to think of them as Tumbling windows that can overlap and be emitted more often than the window size. Events can belong to more than one Hopping window result set. To make a Hopping window the same as a Tumbling window, specify the hop size to be the same as the window size.
Reference:
https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-window-functions
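To make the overlap concrete, here is a small Python simulation of a hopping window with a five-minute window size and a one-minute hop. It is a sketch, not Stream Analytics code. The function name and sample timestamps are hypothetical, and the sketch assumes that each window excludes its start time and includes its end time.

```python
from datetime import datetime, timedelta


def hopping_window_counts(event_times, window, hop, first_end, last_end):
    """Count events per window; one result is emitted at every hop."""
    results = []
    end = first_end
    while end <= last_end:
        start = end - window
        # Assumption: start is exclusive, end is inclusive.
        count = sum(1 for t in event_times if start < t <= end)
        results.append((end, count))
        end += hop
    return results


base = datetime(2021, 1, 1, 0, 0)
# Hypothetical tweets arriving 1, 2, 4, and 7 minutes after the base time.
tweets = [base + timedelta(minutes=m) for m in (1, 2, 4, 7)]

counts = hopping_window_counts(
    tweets,
    window=timedelta(minutes=5),
    hop=timedelta(minutes=1),
    first_end=base + timedelta(minutes=5),
    last_end=base + timedelta(minutes=8),
)
for end, count in counts:
    print(end.time(), count)
```

The tweet at minute 4 is counted in all four results, which is exactly the "events can belong to more than one result set" behavior. Setting the hop equal to the window size would turn this into a tumbling window.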

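Question 78
A company plans to use Platform-as-a-Service (PaaS) to create a new data pipeline process. The process must meet the following requirements:
Ingest:
* Access multiple data sources.
* Provide the ability to orchestrate workflow.
* Provide the capability to run SQL Server Integration Services packages.
Store:
* Optimize storage for big data workloads.
* Provide encryption of data at rest.
* Operate with no size limits.
Prepare and Train:
* Provide a fully managed and interactive workspace for exploration and visualization.
* Provide the ability to program in R, SQL, Python, Scala, and Java.
* Provide seamless user authentication with Azure Active Directory.
Model and Serve:
* Implement native columnar storage.
* Support for the SQL language.
* Provide support for structured streaming.
You need to build the data integration pipeline.
Which technologies should you use? To answer, select the appropriate options in the answer area.
Note: Each correct selection is worth one point.

Correct answer:

Explanation: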

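Question 79
You are designing a date dimension table in an Azure Synapse Analytics dedicated SQL pool. The date dimension table will be used by all the fact tables.
Which distribution type should you recommend to minimize data movement?

A. Round-robin
B. Hash
C. Replicated

Correct answer: C

Explanation: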
A replicated table has a full copy of the table available on every Compute node. Queries run fast on replicated tables since joins on replicated tables don't require data movement. Replication requires extra storage, though, and isn't practical for large tables.
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-overview

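Question 80
You have an Azure Synapse Analytics dedicated SQL pool that contains the users shown in the following table.

User1 executes a query on the database, and the query returns the results shown in the following exhibit.

User1 is the only user who has access to the unmasked data.
Use the drop-down menus to select the answer choice that completes each statement based on the information presented in the graphic.
Note: Each correct selection is worth one point.

Correct answer:

Explanation: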

Reference:
https://docs.microsoft.com/en-us/azure/azure-sql/database/dynamic-data-masking-overview

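Question 81
You have a Microsoft SQL Server database that uses a third normal form schema.
You plan to migrate the data in the database to a star schema in an Azure Synapse Analytics dedicated SQL pool.
You need to design the dimension tables. The solution must optimize read operations.
What should you include in the solution? To answer, select the appropriate options in the answer area.
Note: Each correct selection is worth one point.

Correct answer:

Explanation: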

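Question 82
You have an Azure subscription that contains an Azure Data Lake Storage account. The storage account contains a data lake named DataLake1.
You plan to use an Azure data factory to ingest data from a folder in DataLake1, transform the data, and land the data in another folder.
You need to ensure that the data factory can read and write data from any folder in the DataLake1 file system. The solution must meet the following requirements:
* Minimize the risk of unauthorized user access.
* Use the principle of least privilege.
* Minimize maintenance effort.
How should you configure access to the storage account for the data factory? To answer, select the appropriate options in the answer area.
Note: Each correct selection is worth one point.

Correct answer:

Explanation: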

Box 1: Azure Active Directory (Azure AD)
On Azure, managed identities eliminate the need for developers to manage credentials by providing an identity for the Azure resource in Azure AD and using it to obtain Azure Active Directory (Azure AD) tokens.
Box 2: a managed identity
A data factory can be associated with a managed identity for Azure resources, which represents this specific data factory. You can directly use this managed identity for Data Lake Storage Gen2 authentication, similar to using your own service principal. It allows this designated factory to access and copy data to or from your Data Lake Storage Gen2.
Note: The Azure Data Lake Storage Gen2 connector supports the following authentication types.
* Account key authentication
* Service principal authentication
* Managed identities for Azure resources authentication
Reference:
https://docs.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/overview
https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-data-lake-storage

質問 83
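You configure monitoring for a Microsoft Azure SQL Data Warehouse implementation. The implementation uses PolyBase to load data from comma-separated value (CSV) files stored in Azure Data Lake Gen 2 by using an external table.
Files with an invalid schema cause errors to occur.
You need to monitor for an invalid schema error.
For which error should you monitor?

A. EXTERNAL TABLE access failed due to internal error: 'Java exception raised on call to HdfsBridge_Connect: Error [com.microsoft.polybase.client.KerberosSecureLogin] occurred while accessing external file.'
B. EXTERNAL TABLE access failed due to internal error: 'Java exception raised on call to HdfsBridge_Connect: Error [Unable to instantiate LoginClass] occurred while accessing external file.'
C. EXTERNAL TABLE access failed due to internal error: 'Java exception raised on call to HdfsBridge_Connect: Error [No FileSystem for scheme: wasbs] occurred while accessing external file.'
D. Cannot execute the query "Remote Query" against OLE DB provider "SQLNCLI11" for linked server "(null)". Query aborted-- the maximum reject threshold (0 rows) was reached while reading from an external source: 1 rows rejected out of total 1 rows processed.

Correct answer: D

Explanation: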
Customer Scenario:
SQL Server 2016 or SQL DW connected to Azure blob storage. The CREATE EXTERNAL TABLE DDL points to a directory (and not a specific file) and the directory contains files with different schemas.
SSMS Error:
Select query on the external table gives the following error:
Msg 7320, Level 16, State 110, Line 14
Cannot execute the query "Remote Query" against OLE DB provider "SQLNCLI11" for linked server "(null)". Query aborted-- the maximum reject threshold (0 rows) was reached while reading from an external source: 1 rows rejected out of total 1 rows processed.
Possible Reason:
This error happens because the files have different schemas. When the PolyBase external table DDL points to a directory, it recursively reads all the files in that directory. When a column or data type mismatch occurs, this error can be seen in SSMS.
Possible Solution:
If the data for each table consists of one file, then use the filename in the LOCATION section prepended by the directory of the external files. If there are multiple files per table, put each set of files into different directories in Azure Blob Storage and then you can point LOCATION to the directory instead of a particular file. The latter suggestion is the best practice recommended by SQLCAT even if you have one file per table.
Incorrect Answers:
A: Possible Reason: Kerberos is not enabled in Hadoop Cluster.
References:
https://techcommunity.microsoft.com/t5/DataCAT/PolyBase-Setup-Errors-and-Possible-Solutions/ba-p/305297
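The reject-threshold behavior behind the error in option D can be illustrated with a short Python sketch. This is a simplified model, not PolyBase code. It assumes that a row is rejected when its column count does not match the external table definition and that the read aborts as soon as the number of rejected rows exceeds the configured threshold. The function and exception names are hypothetical.

```python
class RejectThresholdReached(Exception):
    """Raised when too many rows fail to match the expected schema."""


def read_external_rows(rows, expected_columns, reject_value=0):
    """Return rows that match the schema; abort once rejects exceed the limit."""
    accepted = []
    rejected = 0
    for processed, row in enumerate(rows, start=1):
        if len(row) != expected_columns:
            rejected += 1
            if rejected > reject_value:
                raise RejectThresholdReached(
                    f"maximum reject threshold ({reject_value} rows) was reached: "
                    f"{rejected} rows rejected out of total {processed} rows processed"
                )
            continue
        accepted.append(row)
    return accepted


# A directory that mixes two-column and one-column files yields mixed rows.
mixed_rows = [["a", "1"], ["only-one-column"], ["b", "2"]]

tolerant = read_external_rows(mixed_rows, expected_columns=2, reject_value=1)
print(tolerant)
```

With the default threshold of zero in this sketch, the first mismatched row aborts the read, which parallels the "1 rows rejected out of total 1 rows processed" message quoted above.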

Question 84
You have the following Azure Stream Analytics query.

For each of the following statements, select Yes if the statement is true. Otherwise, select No.
Note: Each correct selection is worth one point.

Correct answer:

Explanation:

Box 1: Yes
You can now use a new extension of Azure Stream Analytics SQL to specify the number of partitions of a stream when reshuffling the data.
The outcome is a stream that has the same partition scheme. For example:
WITH step1 AS (SELECT * FROM [input1] PARTITION BY DeviceID INTO 10),
step2 AS (SELECT * FROM [input2] PARTITION BY DeviceID INTO 10)
SELECT * INTO [output] FROM step1 PARTITION BY DeviceID UNION step2 PARTITION BY DeviceID
Note: The new extension of Azure Stream Analytics SQL includes a keyword INTO that allows you to specify the number of partitions for a stream when performing reshuffling using a PARTITION BY statement.
Box 2: Yes
When joining two streams of data explicitly repartitioned, these streams must have the same partition key and partition count.
Box 3: Yes
10 partitions x six SUs = 60 SUs is fine.
Note: Remember, Streaming Unit (SU) count, which is the unit of scale for Azure Stream Analytics, must be adjusted so the number of physical resources available to the job can fit the partitioned flow. In general, six SUs is a good number to assign to each partition. In case there are insufficient resources assigned to the job, the system will only apply the repartition if it benefits the job.
Reference:
https://azure.microsoft.com/en-in/blog/maximize-throughput-with-repartitioning-in-azure-stream-analytics/
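The reason both streams need the same partition key and the same partition count can be shown with a small Python sketch. It is an analogy only: the CRC32 hash stands in for whatever partitioning function Stream Analytics uses internally, and the function names and sample events are hypothetical.

```python
import zlib

PARTITION_COUNT = 10       # matches INTO 10 in the query above
SUS_PER_PARTITION = 6      # rule of thumb quoted in the explanation


def partition_of(device_id, partition_count=PARTITION_COUNT):
    """Map a key to a partition index (CRC32 is only a stand-in hash)."""
    return zlib.crc32(device_id.encode("utf-8")) % partition_count


def repartition(events, partition_count=PARTITION_COUNT):
    """Group events into partitions by their DeviceID."""
    partitions = [[] for _ in range(partition_count)]
    for event in events:
        partitions[partition_of(event["DeviceID"], partition_count)].append(event)
    return partitions


input1 = [{"DeviceID": f"dev-{i}", "source": "input1"} for i in range(20)]
input2 = [{"DeviceID": f"dev-{i}", "source": "input2"} for i in range(20)]

step1 = repartition(input1)
step2 = repartition(input2)

# Same key and same count: a device lands in the same partition in both steps.
aligned = all(
    {e["DeviceID"] for e in p1} == {e["DeviceID"] for e in p2}
    for p1, p2 in zip(step1, step2)
)
suggested_sus = PARTITION_COUNT * SUS_PER_PARTITION
print(aligned, suggested_sus)
```

If the two steps used different partition counts, the same device could land in different partition indexes and the partition-wise union would no longer line up.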

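Question 85
You are building an Azure Synapse Analytics dedicated SQL pool that will contain a fact table for transactions from the first half of the year 2020.
You need to ensure that the table meets the following requirements:
* Minimize the processing time to delete data that is older than 10 years.
* Minimize the I/O for queries that use year-to-date values.
How should you complete the Transact-SQL statement? To answer, select the appropriate options in the answer area.
Note: Each correct selection is worth one point.

Correct answer:

Explanation: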

Reference:
https://docs.microsoft.com/en-us/sql/t-sql/statements/create-partition-function-transact-sql
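When reading partition definitions, it helps to be clear about which partition a boundary value falls into. With RANGE RIGHT, a boundary value belongs to the partition on its right; with RANGE LEFT, it belongs to the partition on its left. The Python sketch below models that rule with the standard `bisect` module. The monthly boundary dates are hypothetical and are not taken from the question's answer.

```python
from bisect import bisect_left, bisect_right
from datetime import date

# Hypothetical monthly boundaries for the first half of 2020.
boundaries = [date(2020, month, 1) for month in range(2, 7)]


def partition_number(value, boundaries, range_right=True):
    """Return the 1-based partition number for a value.

    RANGE RIGHT: a boundary value goes to the partition on its right.
    RANGE LEFT:  a boundary value goes to the partition on its left.
    """
    locate = bisect_right if range_right else bisect_left
    return locate(boundaries, value) + 1


on_boundary = date(2020, 2, 1)
print(partition_number(on_boundary, boundaries, range_right=True))
print(partition_number(on_boundary, boundaries, range_right=False))
```

With monthly boundaries and RANGE RIGHT, each partition holds exactly one calendar month, so removing an old month amounts to dealing with a single partition instead of deleting individual rows.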

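Question 86
You are designing an Azure Databricks interactive cluster. The cluster will be used infrequently and will be configured for auto-termination.
You need to ensure that the cluster configuration is retained indefinitely after the cluster is terminated. The solution must minimize costs.
What should you do?

A. Pin the cluster.
B. Clone the cluster after it is terminated.
C. Terminate the cluster manually when processing completes.
D. Create an Azure runbook that starts the cluster every 90 days.

Correct answer: A

Explanation: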
To keep an interactive cluster configuration even after it has been terminated for more than 30 days, an administrator can pin a cluster to the cluster list.
References:
https://docs.azuredatabricks.net/clusters/clusters-manage.html#automatic-termination

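Question 87
What should you recommend using to secure sensitive customer contact information?

A. Row-level security
B. Transparent Data Encryption (TDE)
C. Data labels
D. Column-level security

Correct answer: D

Explanation: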
Scenario: All cloud data must be encrypted at rest and in transit.
Always Encrypted is a feature designed to protect sensitive data stored in specific database columns from access (for example, credit card numbers, national identification numbers, or data on a need to know basis). This includes database administrators or other privileged users who are authorized to access the database to perform management tasks, but have no business need to access the particular data in the encrypted columns. The data is always encrypted, which means the encrypted data is decrypted only for processing by client applications with access to the encryption key.
Reference:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-security-overview

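Question 88
You need to design an analytical storage solution for the transactional data. The solution must meet the sales transaction dataset requirements.
What should you include in the solution? To answer, select the appropriate options in the answer area.
Note: Each correct selection is worth one point.

Correct answer:

Explanation: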

Box 1: Round-robin
Round-robin tables are useful for improving loading speed.
Scenario: Partition data that contains sales transaction records. Partitions must be designed to provide efficient loads by month.
Box 2: Hash
Hash-distributed tables improve query performance on large fact tables.
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-distribu

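Question 89
You are developing a data engineering solution for a company.
A project requires the deployment of data to Azure Data Lake Storage.
You need to implement role-based access control (RBAC) so that project members can manage the Azure Data Lake Storage resources.
Which three actions should you perform? Each correct answer presents part of the solution.
Note: Each correct selection is worth one point.

A. Configure access control lists (ACL) for the Azure Data Lake Storage account.
B. Configure service-to-service authentication for the Azure Data Lake Storage account.
C. Assign Azure AD security groups to Azure Data Lake Storage.
D. Create security groups in Azure Active Directory (Azure AD) and add project members.
E. Configure end-user authentication for the Azure Data Lake Storage account.

Correct answer: A,C,D

Explanation: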
Reference:
https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-secure-data

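Question 90
The following code segment is used to create an Azure Databricks cluster.

For each of the following statements, select Yes if the statement is true. Otherwise, select No.
Note: Each correct selection is worth one point.

Correct answer:

Explanation: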

Reference:
https://adatis.co.uk/databricks-cluster-sizing/
https://docs.microsoft.com/en-us/azure/databricks/jobs
https://docs.databricks.com/administration-guide/capacity-planning/cmbp.html
https://docs.databricks.com/delta/index.html

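Question 91
You have a SQL pool in Azure Synapse.
You plan to load data from Azure Blob storage to a staging table. Approximately 1 million rows of data will be loaded daily. The table will be truncated before each daily load.
You need to create the staging table. The solution must minimize how long it takes to load the data to the staging table.
How should you configure the table? To answer, select the appropriate options in the answer area.
Note: Each correct selection is worth one point.

Correct answer:

Explanation: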

Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-partition
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-distribute

Question 92
You have the following Azure Stream Analytics query.

For each of the following statements, select Yes if the statement is true. Otherwise, select No.
Note: Each correct selection is worth one point.

Correct answer:

Explanation:

Reference:
https://azure.microsoft.com/en-in/blog/maximize-throughput-with-repartitioning-in-azure-stream-analytics/

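Question 93
You have an Azure Synapse Analytics SQL pool named Pool1 on a logical Microsoft SQL server named Server1.
You need to implement Transparent Data Encryption (TDE) on Pool1 by using a custom key named key1.
Which five actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Correct answer:

Explanation: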

Step 1: Assign a managed identity to Server1
You will need an existing Managed Instance as a prerequisite.
Step 2: Create an Azure key vault and grant the managed identity permissions to the vault
Create the resource and set up Azure Key Vault.
Step 3: Add key1 to the Azure key vault
The recommended way is to import an existing key from a .pfx file or get an existing key from the vault.
Alternatively, generate a new key directly in Azure Key Vault.
Step 4: Configure key1 as the TDE protector for Server1
Provide TDE Protector key
Step 5: Enable TDE on Pool1
Reference:
https://docs.microsoft.com/en-us/azure/azure-sql/managed-instance/scripts/transparent-data-encryption-byok-pow

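Question 94
You create an Azure Databricks cluster and specify an additional library to install.
When you attempt to load the library to a notebook, the library is not found.
You need to identify the cause of the issue.
What should you review?

A. Global init script logs
B. Cluster event logs
C. Workspace logs
D. Notebook logs

Correct answer: A

Explanation: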
Cluster-scoped Init Scripts: Init scripts are shell scripts that run during the startup of each cluster node before the Spark driver or worker JVM starts. Databricks customers use init scripts for various purposes such as installing custom libraries, launching background processes, or applying enterprise security policies.
Logs for Cluster-scoped init scripts are now more consistent with Cluster Log Delivery and can be found in the same root folder as driver and executor logs for the cluster.
Reference:
https://databricks.com/blog/2018/08/30/introducing-cluster-scoped-init-scripts.html

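Question 95
You are designing an Azure Databricks table. The table will ingest an average of 20 million streaming events per day.
You need to persist the events in the table for use in incremental load pipeline jobs in Azure Databricks. The solution must minimize storage costs and incremental load times.
What should you include in the solution?

A. Use a JSON format for physical data storage.
B. Include a watermark column.
C. Partition by DateTime fields.
D. Sink to Azure Queue storage.

Correct answer: C

Explanation: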
The Databricks ABS-AQS connector uses Azure Queue Storage (AQS) to provide an optimized file source that lets you find new files written to an Azure Blob storage (ABS) container without repeatedly listing all of the files.
This provides two major advantages:
Lower latency: no need to list nested directory structures on ABS, which is slow and resource intensive.
Lower costs: no more costly LIST API requests made to ABS.
Reference:
https://docs.microsoft.com/en-us/azure/databricks/spark/latest/structured-streaming/aqs

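Question 96
You have an Azure data solution that contains an enterprise data warehouse in Azure Synapse Analytics named DW1.
Several users execute ad hoc queries to DW1 concurrently.
You regularly perform automated data loads to DW1.
You need to ensure that the automated data loads have enough memory available to complete quickly and successfully when the ad hoc queries run.
What should you do?

A. Create sampled statistics for every column in each table of DW1.
B. Hash distribute the large fact tables in DW1 before performing the automated data loads.
C. Assign a smaller resource class to the automated data load queries.
D. Assign a larger resource class to the automated data load queries.

Correct answer: D

Explanation: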
The performance capacity of a query is determined by the user's resource class. Resource classes are pre-determined resource limits in Synapse SQL pool that govern compute resources and concurrency for query execution.
Resource classes can help you configure resources for your queries by setting limits on the number of queries that run concurrently and on the compute-resources assigned to each query. There's a trade-off between memory and concurrency.
Smaller resource classes reduce the maximum memory per query, but increase concurrency.
Larger resource classes increase the maximum memory per query, but reduce concurrency.
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/resource-classes-for-workload-management
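The memory-versus-concurrency trade-off can be illustrated with a toy model in Python. All numbers and names below are hypothetical and chosen only for illustration; they are not the actual slot counts or memory grants of any Synapse service level.

```python
# Hypothetical capacity model: a fixed pool of concurrency slots is shared by
# all running queries, and a larger resource class consumes more slots.
TOTAL_CONCURRENCY_SLOTS = 40
SLOTS_PER_QUERY = {"small": 1, "medium": 4, "large": 8}


def max_concurrent_queries(resource_class):
    """How many queries of this class fit into the slot pool at once."""
    return TOTAL_CONCURRENCY_SLOTS // SLOTS_PER_QUERY[resource_class]


def memory_share(resource_class):
    """Fraction of total memory one query of this class may use."""
    return SLOTS_PER_QUERY[resource_class] / TOTAL_CONCURRENCY_SLOTS


for name in SLOTS_PER_QUERY:
    print(name, max_concurrent_queries(name), memory_share(name))
```

In this model, moving the load user from the small class to the large class gives each load query eight times as much memory, at the price of fewer queries running side by side.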

Question 97
......

The DP-203 (Japanese version) question set guarantees that you will pass: https://www.goshiken.com/Microsoft/DP-203J-mondaishu.html
