DEA-C02 Free Exam Questions: "Snowflake SnowPro Advanced: Data Engineer (DEA-C02) Certification"
You are tasked with optimizing a Snowpipe Streaming pipeline that ingests data from Kafka into a Snowflake table named 'ORDERS'. You notice that while the Kafka topic has high throughput, data ingestion into Snowflake is lagging. The pipe definition is as follows:

```sql
CREATE OR REPLACE PIPE ORDERS_PIPE AS
  COPY INTO ORDERS
  FROM @KAFKA_STAGE
  FILE_FORMAT = (TYPE = JSON);
```

Which of the following actions, taken individually, would be MOST effective in improving the ingestion rate, assuming sufficient compute resources are available in your Snowflake virtual warehouse?
Correct answer: D
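For reference, a minimal diagnostic sketch for this kind of ingestion lag, assuming placeholder credentials: SYSTEM$PIPE_STATUS and INFORMATION_SCHEMA.COPY_HISTORY are the standard starting points. In practice the bottleneck is often many tiny files produced by the Kafka connector's buffer settings, since per-file overhead dominates.

```python
import json
import snowflake.connector

# Placeholder connection; substitute real account details.
conn = snowflake.connector.connect(account="my_account", user="my_user", password="***")
cur = conn.cursor()

# Backlog and execution state of the pipe (pendingFileCount, executionState, ...).
cur.execute("SELECT SYSTEM$PIPE_STATUS('ORDERS_PIPE')")
print(json.dumps(json.loads(cur.fetchone()[0]), indent=2))

# Per-file load results for the last hour: file sizes, row counts, errors.
cur.execute("""
    SELECT file_name, file_size, row_count, status, first_error_message
    FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
        TABLE_NAME => 'ORDERS',
        START_TIME => DATEADD('hour', -1, CURRENT_TIMESTAMP())))
""")
for row in cur.fetchall():
    print(row)

conn.close()
```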
You are tasked with managing a large Snowflake table called 'TRANSACTIONS'. Due to compliance requirements, you need to archive data older than one year to long-term storage (AWS S3) while ensuring that queries against the current 'TRANSACTIONS' table remain performant. What is the MOST efficient strategy, using Snowflake features, that minimizes the impact on query performance?
Correct answer: D
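For reference, a minimal archival sketch under assumptions: a hypothetical TXN_DATE column, an existing external stage ARCHIVE_STAGE pointing at S3, and placeholder credentials. Unloaded files are written immediately and are not undone by a rollback, so the unload should be verified before the delete.

```python
import snowflake.connector

conn = snowflake.connector.connect(account="my_account", user="my_user", password="***")
cur = conn.cursor()

# Unload rows older than one year to S3 as Parquet via the external stage.
cur.execute("""
    COPY INTO @ARCHIVE_STAGE/transactions/
    FROM (SELECT * FROM TRANSACTIONS
          WHERE TXN_DATE < DATEADD('year', -1, CURRENT_DATE()))
    FILE_FORMAT = (TYPE = PARQUET)
""")

# Then trim the hot table so current queries scan fewer micro-partitions.
cur.execute("""
    DELETE FROM TRANSACTIONS
    WHERE TXN_DATE < DATEADD('year', -1, CURRENT_DATE())
""")
conn.close()
```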
You are developing a Snowpark Python application that needs to process data from a Kafka topic. The data is structured as Avro records. You want to leverage Snowpipe for ingestion and Snowpark DataFrames for transformation. What is the MOST efficient and scalable approach to integrate these components?
Correct answer: C
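For reference, one common shape for this pattern, as a sketch: Snowpipe lands the Avro payload in a VARIANT column, and Snowpark projects typed columns from it. The table and field names (RAW_AVRO, order_id, amount) are hypothetical, as are the connection parameters.

```python
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col

session = Session.builder.configs({
    "account": "my_account", "user": "my_user", "password": "***",
    "warehouse": "ETL_WH", "database": "MY_DB", "schema": "PUBLIC",
}).create()

# 1) Snowpipe lands the Avro records untouched in a single VARIANT column.
session.sql("CREATE TABLE IF NOT EXISTS RAW_AVRO (V VARIANT)").collect()
session.sql("""
    CREATE PIPE IF NOT EXISTS KAFKA_AVRO_PIPE AUTO_INGEST = TRUE AS
    COPY INTO RAW_AVRO FROM @KAFKA_STAGE FILE_FORMAT = (TYPE = AVRO)
""").collect()

# 2) Snowpark reads the landing table and projects typed columns.
orders = session.table("RAW_AVRO").select(
    col("V")["order_id"].cast("string").alias("ORDER_ID"),
    col("V")["amount"].cast("double").alias("AMOUNT"),
)
orders.write.save_as_table("ORDERS_CLEAN", mode="append")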
You are responsible for monitoring the performance of a Snowflake data pipeline that loads data from S3 into a Snowflake table named 'SALES_DATA'. You notice that the COPY INTO command consistently takes longer than expected. You want to implement telemetry to proactively identify the root cause of the performance degradation. Which of the following methods, used together, provide the MOST comprehensive telemetry data for troubleshooting the COPY INTO performance?
Correct answers: A, C
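For reference, a sketch combining the two telemetry sources most often paired for COPY INTO troubleshooting: per-file results from COPY_HISTORY and per-statement timings from ACCOUNT_USAGE.QUERY_HISTORY. Credentials are placeholders.

```python
import snowflake.connector

conn = snowflake.connector.connect(account="my_account", user="my_user", password="***")
cur = conn.cursor()

# Per-file view: sizes, row counts, and load errors for SALES_DATA.
cur.execute("""
    SELECT file_name, file_size, row_count, status, first_error_message
    FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
        TABLE_NAME => 'SALES_DATA',
        START_TIME => DATEADD('day', -1, CURRENT_TIMESTAMP())))
""")
print(cur.fetchall())

# Per-statement view: where the time goes (queueing vs. execution).
# ACCOUNT_USAGE lags by up to ~45 minutes but has long retention.
cur.execute("""
    SELECT query_id, warehouse_name, total_elapsed_time,
           execution_time, queued_overload_time
    FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
    WHERE query_text ILIKE 'COPY INTO SALES_DATA%'
    ORDER BY start_time DESC
    LIMIT 20
""")
print(cur.fetchall())
conn.close()
```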
A critical database, 'PRODUCTION_DB', in your Snowflake account was accidentally dropped. You need to restore it as quickly as possible, but you're unsure whether the Time Travel retention is sufficient. Which method guarantees restoration of the database even if it falls outside the Time Travel window?
Correct answer: A
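For reference, a sketch of the self-service path, which works only inside the Time Travel window; beyond it, the data sits in Fail-safe, which only Snowflake Support can recover, so only a pre-existing copy (a replica, clone, or unloaded backup) guarantees restoration on your own.

```python
import snowflake.connector

conn = snowflake.connector.connect(account="my_account", user="my_user", password="***")
cur = conn.cursor()

# Confirm the drop is still within the Time Travel retention window;
# the history output includes a dropped_on column.
cur.execute("SHOW DATABASES HISTORY LIKE 'PRODUCTION_DB'")
print(cur.fetchall())

# Self-service restore, available only inside Time Travel retention.
cur.execute("UNDROP DATABASE PRODUCTION_DB")
conn.close()
```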
A Snowflake data warehouse contains a table 'WEB_EVENTS' with columns such as 'EVENT_ID', 'EVENT_TIMESTAMP', 'USER_ID', 'PAGE_URL', and 'SESSION_ID'. The data engineering team has enabled search optimization on 'PAGE_URL' because analysts frequently filter on specific URLs. However, they notice that queries filtering on multiple 'PAGE_URL' values (e.g., using WHERE PAGE_URL IN ('url1', 'url2', ...)) are not performing as well as expected. What are the potential reasons for this behavior, and what strategies can be used to improve performance in this scenario? Select all that apply:
Correct answers: B, C
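For reference, a sketch of enabling and verifying equality search optimization, which is the access path an IN list of URLs uses; queries only benefit once the background build is complete, which DESCRIBE lets you check.

```python
import snowflake.connector

conn = snowflake.connector.connect(account="my_account", user="my_user", password="***")
cur = conn.cursor()

# Target the access path analysts actually use: equality/IN predicates on PAGE_URL.
cur.execute("ALTER TABLE WEB_EVENTS ADD SEARCH OPTIMIZATION ON EQUALITY(PAGE_URL)")

# Verify the configuration and the build progress of the search access path.
cur.execute("DESCRIBE SEARCH OPTIMIZATION ON WEB_EVENTS")
print(cur.fetchall())
conn.close()
```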
You are building a data pipeline that extracts data from a REST API, transforms it using Pandas DataFrames, and loads it into Snowflake. You need to implement error handling to gracefully handle network issues and API rate limits. Which of the following code snippets demonstrates the most robust approach to handle potential errors during data loading into Snowflake using the Python connector?
Correct answer: C
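For reference, a sketch of one robust pattern: exponential backoff around the API call (including HTTP 429 rate limits) and an explicit transaction around the load so a failure leaves no partial batch. The endpoint and the STAGING_ORDERS table are hypothetical.

```python
import json
import time

import requests
import snowflake.connector
from snowflake.connector.errors import OperationalError

def fetch_json(url, max_retries=5):
    """GET with exponential backoff on rate limits and network failures."""
    for attempt in range(max_retries):
        try:
            resp = requests.get(url, timeout=10)
            if resp.status_code == 429:  # API rate limit hit
                time.sleep(2 ** attempt)
                continue
            resp.raise_for_status()
            return resp.json()
        except requests.exceptions.RequestException:
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)

conn = snowflake.connector.connect(account="my_account", user="my_user", password="***")
try:
    records = fetch_json("https://api.example.com/orders")  # placeholder endpoint
    cur = conn.cursor()
    cur.execute("BEGIN")
    cur.executemany(
        "INSERT INTO STAGING_ORDERS (ID, PAYLOAD) VALUES (%s, %s)",
        [(r["id"], json.dumps(r)) for r in records],
    )
    cur.execute("COMMIT")
except OperationalError:
    conn.rollback()  # transient connectivity problem: leave no partial batch
    raise
finally:
    conn.close()
```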
You are tasked with implementing a projection policy in Snowflake to restrict access to certain columns of the 'EMPLOYEE' table based on the user's role. The table contains columns such as 'EMPLOYEE_ID', 'NAME', 'SALARY', and 'DEPARTMENT'. Users with the 'HR_MANAGER' role should have access to all columns, while other users should only be able to see 'EMPLOYEE_ID', 'NAME', and 'DEPARTMENT'. The initial attempt to create the projection policy results in an error. What could be the reasons?
Correct answers: A, C
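For reference, a policy shape that does create successfully, as a sketch: a projection policy takes no input arguments and must return PROJECTION_CONSTRAINT, and it is attached per column. Deviating from that signature is a common reason CREATE PROJECTION POLICY fails.

```python
import snowflake.connector

conn = snowflake.connector.connect(account="my_account", user="my_user", password="***")
cur = conn.cursor()

# Projection policies have a fixed signature: no arguments, returning
# PROJECTION_CONSTRAINT.
cur.execute("""
    CREATE OR REPLACE PROJECTION POLICY hide_salary
    AS () RETURNS PROJECTION_CONSTRAINT ->
    CASE WHEN CURRENT_ROLE() = 'HR_MANAGER'
         THEN PROJECTION_CONSTRAINT(ALLOW => true)
         ELSE PROJECTION_CONSTRAINT(ALLOW => false)
    END
""")

# Attach it to the column that must disappear from non-HR projections.
cur.execute("ALTER TABLE EMPLOYEE MODIFY COLUMN SALARY SET PROJECTION POLICY hide_salary")
conn.close()
```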
You're tasked with building a data pipeline using Snowpark Python to incrementally load data into a target table 'SALES_SUMMARY' from a source table 'RAW_SALES'. The pipeline needs to ensure that only new or updated records from 'RAW_SALES' are merged into 'SALES_SUMMARY', matched on a 'TRANSACTION_ID'. You want to use Snowpark's 'MERGE' operation for this, but you also need to handle potential conflicts and log any rejected records to an error table 'SALES_SUMMARY_ERRORS'. Which of the following approaches offers the MOST robust and efficient solution for handling errors and ensuring data integrity within the MERGE statement?
Correct answer: B
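For reference, a Snowpark sketch of this shape: rows that would break the merge (here, a NULL key) are diverted to the error table first, then the MERGE upserts on TRANSACTION_ID. The AMOUNT column stands in for the real payload columns; connection parameters are placeholders.

```python
from snowflake.snowpark import Session
from snowflake.snowpark.functions import when_matched, when_not_matched

session = Session.builder.configs({
    "account": "my_account", "user": "my_user", "password": "***",
    "warehouse": "ETL_WH", "database": "MY_DB", "schema": "PUBLIC",
}).create()

source = session.table("RAW_SALES")
target = session.table("SALES_SUMMARY")

# Divert obviously bad rows to the error table before merging,
# so the MERGE itself stays deterministic.
rejects = source.filter(source["TRANSACTION_ID"].is_null())
rejects.write.save_as_table("SALES_SUMMARY_ERRORS", mode="append")
clean = source.filter(source["TRANSACTION_ID"].is_not_null())

# Upsert keyed on TRANSACTION_ID.
result = target.merge(
    clean,
    target["TRANSACTION_ID"] == clean["TRANSACTION_ID"],
    [
        when_matched().update({"AMOUNT": clean["AMOUNT"]}),
        when_not_matched().insert(
            {"TRANSACTION_ID": clean["TRANSACTION_ID"], "AMOUNT": clean["AMOUNT"]}
        ),
    ],
)
print(result.rows_inserted, result.rows_updated)
```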
A data engineer is using the Snowflake Spark connector to read a large table from Snowflake into a Spark DataFrame. The table contains a 'TIMESTAMP_NTZ' column. After loading the data, the engineer observes that the values in the 'TIMESTAMP_NTZ' column are not preserved accurately when retrieved from the DataFrame. What are the potential issues, and which configurations can be adjusted in Snowflake to improve the result?
Correct answers: B, E
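For reference, a PySpark sketch: Spark versions before 3.4 have no TIMESTAMP_NTZ type, so the connector converts NTZ values through a timezone. Pinning both sides to the same zone (the connector's sfTimezone option and Spark's session time zone, with Snowflake's TIMEZONE session parameter playing the matching role server-side) keeps wall-clock values stable. Connection values and the EVENTS table are placeholders.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ntz-check").getOrCreate()
spark.conf.set("spark.sql.session.timeZone", "UTC")  # Spark-side zone

sf_options = {
    "sfURL": "my_account.snowflakecomputing.com",
    "sfUser": "my_user",
    "sfPassword": "***",
    "sfDatabase": "MY_DB",
    "sfSchema": "PUBLIC",
    "sfWarehouse": "ETL_WH",
    "sfTimezone": "UTC",  # connector-side timezone handling
}

df = (
    spark.read.format("net.snowflake.spark.snowflake")
    .options(**sf_options)
    .option("query", "SELECT EVENT_TS FROM EVENTS")  # EVENT_TS: TIMESTAMP_NTZ
    .load()
)
df.show(truncate=False)
```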
A data engineer is tasked with creating a Listing to share a large dataset stored in Snowflake. The dataset contains sensitive Personally Identifiable Information (PII) that must be masked for certain consumer roles. The data engineer wants to use Snowflake's dynamic data masking policies within the Listing to achieve this. Which of the following approaches is the MOST secure and maintainable way to implement this requirement, assuming that the consumer roles are pre-defined and known?
Correct answer: C
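For reference, a sketch of a masking policy attached on the provider side, using hypothetical CUSTOMERS/EMAIL names. Because provider roles do not exist in a consumer account, share-facing conditions are typically written against CURRENT_ACCOUNT() or shared database roles rather than CURRENT_ROLE().

```python
import snowflake.connector

conn = snowflake.connector.connect(account="my_account", user="my_user", password="***")
cur = conn.cursor()

# The policy travels with the shared object, so every consumer query
# is evaluated against it.
cur.execute("""
    CREATE OR REPLACE MASKING POLICY MASK_PII AS (val STRING)
    RETURNS STRING ->
    CASE WHEN CURRENT_ACCOUNT() IN ('TRUSTED_CONSUMER_ACCT')  -- placeholder account
         THEN val
         ELSE '***MASKED***'
    END
""")
cur.execute("ALTER TABLE CUSTOMERS MODIFY COLUMN EMAIL SET MASKING POLICY MASK_PII")
conn.close()
```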
You are designing a Snowflake data pipeline that continuously ingests clickstream data. You need to monitor the pipeline for latency and throughput, and trigger notifications if these metrics fall outside acceptable ranges. Which of the following combinations of Snowflake features and techniques would be MOST effective for achieving this goal?
Correct answers: B, E
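For reference, a sketch of a scheduled alert on ingestion freshness; the table, warehouse, and notification integration names are placeholders.

```python
import snowflake.connector

conn = snowflake.connector.connect(account="my_account", user="my_user", password="***")
cur = conn.cursor()

# Every 5 minutes, check freshness and notify if no clickstream rows
# arrived in the last 10 minutes.
cur.execute("""
    CREATE OR REPLACE ALERT CLICKSTREAM_LAG_ALERT
      WAREHOUSE = MONITOR_WH
      SCHEDULE = '5 MINUTE'
      IF (EXISTS (
            SELECT 1 FROM CLICKSTREAM_EVENTS
            HAVING DATEDIFF('minute', MAX(EVENT_TS), CURRENT_TIMESTAMP()) > 10))
      THEN CALL SYSTEM$SEND_EMAIL(
            'MY_EMAIL_INTEGRATION',
            'oncall@example.com',
            'Clickstream pipeline latency breach',
            'No clickstream events loaded in the last 10 minutes.')
""")
cur.execute("ALTER ALERT CLICKSTREAM_LAG_ALERT RESUME")  # alerts start suspended
conn.close()
```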
You are tasked with building a data pipeline to process image metadata stored in JSON format from a series of URLs. The JSON structure contains fields such as 'image_url', 'resolution', 'camera_model', and 'location' (latitude and longitude). Your goal is to create a Snowflake table that stores this metadata along with a thumbnail of each image. Given the constraints that you want to avoid downloading and storing the images directly in Snowflake, and that Snowflake's native functions for image processing are limited, which of the following approaches would be most efficient and scalable?
Correct answers: A, D
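For reference, a sketch of the externalized approach: a remote service behind an API integration (for example, an AWS Lambda) produces thumbnail references, so no image bytes land in Snowflake. All object names and the endpoint URL are placeholders, and RAW_IMAGE_JSON is assumed to hold the parsed metadata in a VARIANT column V.

```python
import snowflake.connector

conn = snowflake.connector.connect(account="my_account", user="my_user", password="***")
cur = conn.cursor()

# Delegate image work to a remote service via an external function.
cur.execute("""
    CREATE OR REPLACE EXTERNAL FUNCTION MAKE_THUMBNAIL(image_url STRING)
    RETURNS VARIANT
    API_INTEGRATION = MY_API_INTEGRATION
    AS 'https://example.execute-api.us-east-1.amazonaws.com/prod/thumbnail'
""")

# Project the JSON metadata fields plus a thumbnail reference per image.
cur.execute("""
    INSERT INTO IMAGE_METADATA
    SELECT V:image_url::STRING,
           V:resolution::STRING,
           V:camera_model::STRING,
           V:location,
           MAKE_THUMBNAIL(V:image_url::STRING)
    FROM RAW_IMAGE_JSON
""")
conn.close()
```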
You are developing a Snowpark Python application that reads data from a large Snowflake table, performs several transformations, and then writes the results back to a new table. You notice that the write operation is taking significantly longer than the read and transformation steps. The target table is not clustered. Which of the following actions, either individually or in combination, would likely improve the write performance most significantly?
Correct answer: A
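For reference, a sketch of keeping the write server-side and sizing the warehouse for the write-heavy step; all names are placeholders and the transformations are elided.

```python
from snowflake.snowpark import Session

session = Session.builder.configs({
    "account": "my_account", "user": "my_user", "password": "***",
    "warehouse": "ETL_WH", "database": "MY_DB", "schema": "PUBLIC",
}).create()

df = session.table("BIG_SOURCE")  # placeholder source table
transformed = df                  # ... transformations elided ...

# save_as_table compiles to a single server-side CTAS/INSERT on the
# warehouse, instead of pulling rows to the client and re-uploading them.
# Temporarily resizing the warehouse speeds up the write-heavy step.
session.sql("ALTER WAREHOUSE ETL_WH SET WAREHOUSE_SIZE = 'LARGE'").collect()
transformed.write.save_as_table("RESULTS", mode="overwrite")
session.sql("ALTER WAREHOUSE ETL_WH SET WAREHOUSE_SIZE = 'XSMALL'").collect()
```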
You are designing a data pipeline using Snowpipe to ingest data from multiple S3 buckets into a single Snowflake table. Each S3 bucket represents a different data source and contains files in JSON format. You want to use Snowpipe's auto-ingest feature and a single Snowpipe object for all buckets to simplify management and reduce overhead. However, each data source has a different JSON schema. How can you best achieve this goal while ensuring data is loaded correctly and efficiently into the target table?
Correct answer: B
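For reference, a sketch of the schema-on-read pattern this scenario usually points to, assuming the sources are reachable through a single stage location (a pipe copies from one stage): everything lands in a VARIANT column, tagged with its source file so per-source views can be layered on top. Object names are placeholders.

```python
import snowflake.connector

conn = snowflake.connector.connect(account="my_account", user="my_user", password="***")
cur = conn.cursor()

# Land every source's JSON untouched in a VARIANT column, tagging each row
# with its file path so downstream views can tell the schemas apart.
cur.execute("""
    CREATE TABLE IF NOT EXISTS EVENTS_RAW (
        SRC VARIANT, SOURCE_FILE STRING, LOADED_AT TIMESTAMP_NTZ)
""")
cur.execute("""
    CREATE OR REPLACE PIPE EVENTS_PIPE AUTO_INGEST = TRUE AS
    COPY INTO EVENTS_RAW (SRC, SOURCE_FILE, LOADED_AT)
    FROM (SELECT $1, METADATA$FILENAME, CURRENT_TIMESTAMP()
          FROM @MULTI_SOURCE_STAGE)
    FILE_FORMAT = (TYPE = JSON)
""")
conn.close()
```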