Exam Databricks-Machine-Learning-Associate All Questions

Databricks ML Data Scientist Databricks-Machine-Learning-Associate Question # 20 Topic 3 Discussion

Question #: 20
Topic #: 3

A data scientist has developed a random forest regressor rfr and included it as the final stage in a Spark ML Pipeline pipeline. They then set up a cross-validation process with pipeline as the estimator in the following code block:

[Image: code block for Question 20, not reproduced]

Which of the following is a negative consequence of including pipeline as the estimator in the cross-validation process rather than rfr as the estimator?


A. The process will have a longer runtime because all stages of pipeline need to be refit or retransformed with each model

B. The process will leak data from the training set to the test set during the evaluation phase

C. The process will be unable to parallelize tuning due to the distributed nature of pipeline

D. The process will leak data prep information from the validation sets to the training sets for each model
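
To make the runtime trade-off between the two estimator choices concrete, here is a rough counting sketch in plain Python. It is not the code block from the question and does not call Spark; the function name count_stage_fits and the numbers used (2 prep stages, 6 parameter combinations, 3 folds) are hypothetical. It assumes every prep stage is an estimator that has to be fit, and it ignores the final refit of the best model on the full dataset.

```python
def count_stage_fits(num_prep_stages, num_param_maps, num_folds, pipeline_as_estimator):
    """Count stage fits performed during k-fold cross-validation.

    Hypothetical helper for illustration only. Assumes each prep stage
    is an estimator, and ignores the final refit on the full dataset.
    """
    # One model is trained per (parameter combination, fold) pair.
    models_trained = num_param_maps * num_folds
    if pipeline_as_estimator:
        # The whole pipeline is refit for every model, prep stages included.
        prep_fits = num_prep_stages * models_trained
    else:
        # Prep stages are fit once up front; only the regressor is tuned.
        prep_fits = num_prep_stages
    return {"prep_fits": prep_fits, "regressor_fits": models_trained}


# Assumed example sizes: 2 prep stages, 6 parameter combinations, 3 folds.
with_pipeline = count_stage_fits(2, 6, 3, pipeline_as_estimator=True)
with_rfr_only = count_stage_fits(2, 6, 3, pipeline_as_estimator=False)

print("pipeline as estimator:", with_pipeline)
print("rfr as estimator:     ", with_rfr_only)
```

Under these assumptions the regressor is fit the same number of times either way, and the difference lies entirely in how often the earlier stages are repeated. Fitting the prep stages once outside cross-validation is cheaper, but those stages then see the rows that later serve as validation folds, which is why the pipeline is often kept inside the cross-validator despite the extra cost.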

