Exam Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 All Questions

Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Exam Topic 1 Question 2 Discussion:
Question #: 2
Topic #: 1

A data scientist is working on a large dataset in Apache Spark using PySpark. The data scientist has a DataFrame df with columns user_id, product_id, and purchase_amount and needs to perform some operations on this data efficiently.

Which sequence of operations results in transformations that require a shuffle followed by transformations that do not?


A.

df.filter(df.purchase_amount > 100).groupBy("user_id").sum("purchase_amount")


B.

df.withColumn("discount", df.purchase_amount * 0.1).select("discount")


C.

df.withColumn("purchase_date", current_date()).where("total_purchase > 50")


D.

df.groupBy("user_id").agg(sum("purchase_amount").alias("total_purchase")).repartition(10)
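One way to reason about the options is to tag each transformation as narrow (no shuffle) or wide (shuffle). The sketch below does this in plain Python. The names NEEDS_SHUFFLE, OPTIONS, and classify are hypothetical helpers for illustration, not part of any Spark API, and groupBy plus its aggregation is treated as a single step because the shuffle happens when the aggregation is planned. In a live session, df.explain() on each expression should show an Exchange operator in the physical plan wherever a shuffle occurs.

```python
# Hypothetical lookup table (an assumption for illustration, not a Spark API):
# True means the transformation redistributes rows across partitions.
NEEDS_SHUFFLE = {
    "filter": False,       # row-wise predicate, narrow
    "where": False,        # alias of filter, narrow
    "withColumn": False,   # row-wise expression, narrow
    "select": False,       # column projection, narrow
    "groupBy+agg": True,   # rows with the same key must be co-located
    "repartition": True,   # repartition(n) performs a full shuffle
}

# The transformations of each answer option, in the order they are chained.
OPTIONS = {
    "A": ["filter", "groupBy+agg"],
    "B": ["withColumn", "select"],
    "C": ["withColumn", "where"],
    "D": ["groupBy+agg", "repartition"],
}


def classify(ops):
    """Label each transformation in a chain as 'wide' or 'narrow'."""
    return ["wide" if NEEDS_SHUFFLE[op] else "narrow" for op in ops]


for label, ops in OPTIONS.items():
    print(label, classify(ops))
```

The table only encodes the usual narrow/wide classification; it does not model optimizer behavior such as predicate pushdown, so the physical plan remains the authoritative check.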

