Google Professional Machine Learning Engineer Professional-Machine-Learning-Engineer Question # 48 Topic 5 Discussion

Professional-Machine-Learning-Engineer Exam Topic 5 Question 48 Discussion:
Question #: 48
Topic #: 5

You are an ML engineer at a mobile gaming company. A data scientist on your team recently trained a TensorFlow model, and you are responsible for deploying this model into a mobile application. You discover that the inference latency of the current model doesn’t meet production requirements. You need to reduce the inference time by 50%, and you are willing to accept a small decrease in model accuracy in order to reach the latency requirement. Without training a new model, which model optimization technique for reducing latency should you try first?


A. Weight pruning
B. Dynamic range quantization
C. Model distillation
D. Dimensionality reduction
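
For anyone unsure what option B means: dynamic range quantization is a post-training technique, so it fits the "without training a new model" constraint. Weights are stored as 8-bit integers plus a scale factor, and activations are quantized on the fly at inference time. Below is a minimal pure-Python sketch of the weight-quantization step only, assuming symmetric per-tensor scaling. The function names and sample weights are made up for illustration; this is not the TensorFlow Lite implementation.

```python
def quantize_weights(weights):
    """Map float weights to int8 values with one symmetric per-tensor scale."""
    max_abs = max(abs(w) for w in weights)
    # Assumption: the largest magnitude maps to 127; all-zero tensors use scale 1.0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in quantized]


# Hypothetical weights, chosen only to illustrate the round trip.
weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_weights(weights)
restored = dequantize(q, scale)

# Rounding to the nearest step bounds the per-weight error by scale / 2.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

The small rounding error is the "small decrease in model accuracy" the question mentions. In TensorFlow Lite, this mode is typically enabled by setting `converter.optimizations = [tf.lite.Optimize.DEFAULT]` on a `tf.lite.TFLiteConverter` without supplying a representative dataset; check the current TensorFlow documentation before relying on that.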

