

Google Machine Learning Engineer Professional-Machine-Learning-Engineer Question # 56 Topic 6 Discussion


You are an ML engineer at a mobile gaming company. A data scientist on your team recently trained a TensorFlow model, and you are responsible for deploying this model into a mobile application. You discover that the inference latency of the current model doesn’t meet production requirements. You need to reduce the inference time by 50%, and you are willing to accept a small decrease in model accuracy in order to reach the latency requirement. Without training a new model, which model optimization technique for reducing latency should you try first?


A. Weight pruning
B. Dynamic range quantization
C. Model distillation
D. Dimensionality reduction
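
For readers unfamiliar with option B, the following is a minimal sketch of the weight-quantization step behind post-training dynamic range quantization: float weights are mapped to 8-bit integers with a single scale factor, with no retraining. It is an illustration in plain Python, not TensorFlow's implementation. The function names, the symmetric per-tensor scheme, and the sample weights are all assumptions made for this example.

```python
def quantize_int8(weights):
    """Map float weights to int8 values using one symmetric scale (assumed scheme)."""
    max_abs = max(abs(w) for w in weights)
    # Avoid division by zero when every weight is 0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range used here.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in quantized]


# Hypothetical float weights, chosen only for illustration.
weights = [0.52, -1.27, 0.03, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight now takes one byte instead of four, and the rounding error stays within half a quantization step. That is the kind of small accuracy cost the question says is acceptable. In TensorFlow Lite, this style of post-training quantization is typically requested by setting `converter.optimizations = [tf.lite.Optimize.DEFAULT]` on a `tf.lite.TFLiteConverter` without supplying a representative dataset.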


