A data scientist is working with a data set that has ten predictors and wants to use only the predictors that most influence the results. Which of the following models would be the best for the data scientist to use?
→ LASSO (Least Absolute Shrinkage and Selection Operator) regression performs both variable selection and regularization by adding an L1 penalty to the loss function. It can shrink the coefficients of less important predictors to exactly zero, effectively performing feature selection, which makes it well suited to identifying the most influential predictors.
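For reference, this is the standard LASSO objective (the usual least-squares loss plus an L1 penalty; the notation below is the common textbook formulation, not a quote from the study guide):

$$
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta}\; \sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j|
$$

Larger values of the tuning parameter λ force more coefficients to exactly zero. Ridge regression replaces the |β_j| terms with β_j², which shrinks coefficients but never sets them to zero.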
Why the other options are incorrect:
A: OLS uses all predictors and doesn’t perform feature selection.
B: Ridge regression applies an L2 penalty, shrinking coefficients but keeping all predictors.
C: Weighted least squares adjusts for heteroscedasticity (non-constant error variance) but doesn’t reduce the number of predictors.
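A minimal sketch of how this plays out in practice, assuming scikit-learn and a synthetic ten-predictor data set (the alpha values and the generated data are illustrative, not part of the exam material):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge
from sklearn.preprocessing import StandardScaler

# Synthetic data set: ten predictors, only three of which carry real signal.
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=10.0, random_state=0)
X = StandardScaler().fit_transform(X)   # scale features before penalized regression

lasso = Lasso(alpha=1.0).fit(X, y)      # L1 penalty -> sparse coefficients
ridge = Ridge(alpha=1.0).fit(X, y)      # L2 penalty -> shrunk but nonzero

print("LASSO coefficients:", np.round(lasso.coef_, 2))
print("Predictors kept by LASSO:", int(np.sum(lasso.coef_ != 0)))  # typically only the informative ones
print("Predictors kept by Ridge:", int(np.sum(ridge.coef_ != 0)))  # all ten remain nonzero
```

The contrast in the last two lines is the key point of the question: the L1 penalty drives weak predictors to exactly zero, while the L2 penalty only shrinks them.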
Official References:
CompTIA DataX (DY0-001) Study Guide, Section 3.3: “LASSO performs feature selection by zeroing out coefficients of less significant predictors.”
Statistical Learning Textbook, Chapter 6: “LASSO regression is ideal when model interpretability and variable reduction are important.”