Associate-Developer-Apache-Spark-3.5 free practice questions: "Databricks Certified Associate Developer for Apache Spark 3.5 - Python" certification

A Spark application suffers from too many small tasks due to excessive partitioning. How can this be fixed without a full shuffle?

Explanation: (visible to GoShiken members only)
A data engineer needs to write a DataFrame df to a Parquet file, partitioned by the column country, and overwrite any existing data at the destination path.
Which code should the data engineer use to accomplish this task in Apache Spark?

What is the benefit of using Pandas on Spark for data transformations?

A data engineer needs to persist a file-based data source to a specific location. However, by default, Spark writes to the warehouse directory (e.g., /user/hive/warehouse). To override this, the engineer must explicitly define the file path.
Which line of code ensures the data is saved to a specific location?

A Spark developer is building an app to monitor task performance. They need to track the maximum task processing time per worker node and consolidate it on the driver for analysis.
Which technique should be used?

An engineer has a large ORC file located at /file/test_data.orc and wants to read only specific columns to reduce memory usage.
Which code fragment will select the columns, i.e., col1, col2, during the reading process?

A developer runs:

What is the result?

Given the code fragment:

import pyspark.pandas as ps
psdf = ps.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
Which method is used to convert a Pandas API on Spark DataFrame (pyspark.pandas.DataFrame) into a standard PySpark DataFrame (pyspark.sql.DataFrame)?
