Associate-Developer-Apache-Spark-3.5 free exam question set: "Databricks Certified Associate Developer for Apache Spark 3.5 - Python Certification"
Given a CSV file with the content:

And the following code:
from pyspark.sql.types import *
schema = StructType([
    StructField("name", StringType()),
    StructField("age", IntegerType())
])
spark.read.schema(schema).csv(path).collect()
What is the resulting output?

Correct answer: D
A data engineer is working on a real-time analytics pipeline using Apache Spark Structured Streaming. The engineer wants to process incoming data and ensure that triggers control when the query is executed. The system needs to process data in micro-batches with a fixed interval of 5 seconds.
Which code snippet could the data engineer use to fulfil this requirement?
A)

B)

C)

D)

Options:
Correct answer: B
A developer is trying to join two tables, sales.purchases_fct and sales.customer_dim, using the following code:

fact_df = purch_df.join(cust_df, F.col('customer_id') == F.col('custid'))
The developer has discovered that customers in the purchases_fct table that do not exist in the customer_dim table are being dropped from the joined table.
Which change should be made to the code to stop these customer records from being dropped?

Correct answer: C