Databricks Certified Generative AI Engineer Associate Databricks-Generative-AI-Engineer-Associate Question # 2 Topic 1 Discussion

Databricks-Generative-AI-Engineer-Associate Exam Topic 1 Question 2 Discussion:
Question #: 2
Topic #: 1

A Generative AI Engineer has built an LLM-based system that will automatically translate user text between two languages. They now want to benchmark multiple LLMs on this task and pick the best one. They have an evaluation set with known high-quality translation examples. They want to evaluate each LLM using the evaluation set with a performant metric.

Which metric should they choose for this evaluation?


A. ROUGE metric
B. BLEU metric
C. NDCG metric
D. RECALL metric
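
For context on the scenario, here is a minimal sketch of how an n-gram overlap score in the style of BLEU (option B) could be used to compare candidate translations against a reference set. It is a simplified, sentence-level version with no smoothing, averaged over the set. The model names, sentences, and the helper names `ngrams`, `bleu`, and `average_bleu` are all hypothetical and only for illustration; in practice an established library such as sacreBLEU would normally be used instead of hand-rolled code.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: clipped n-gram precision plus brevity penalty.

    Assumption: whitespace tokenization, one reference, no smoothing,
    so any n-gram order with zero overlap yields a score of 0.0.
    """
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


def average_bleu(candidates, references):
    """Mean of sentence-level scores (real BLEU aggregates at corpus level)."""
    scores = [bleu(c, r) for c, r in zip(candidates, references)]
    return sum(scores) / len(scores)


# Hypothetical evaluation set and model outputs.
references = ["the cat sits on the mat", "I would like a cup of tea"]
outputs = {
    "model_a": ["the cat sits on the mat", "I would like a cup of tea"],
    "model_b": ["a cat is on the mat", "I want a cup of tea"],
}

results = {name: average_bleu(cands, references) for name, cands in outputs.items()}
best = max(results, key=results.get)
print(results, best)
```

Because the score is computed purely from token overlap with the known reference translations, it is cheap to run across many models, which is what the question means by a performant metric.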

