ISACA Advanced in AI Audit (AAIA) Question #58, Topic 6 Discussion
Question #: 58
Topic #: 6
An IS auditor reviewing the latest AI chatbot release identifies that, despite high accuracy rates, non-English users complain about the model's poor accuracy. Which of the following controls is BEST at ensuring detection of subgroup regressions?
A. Weighted metric with higher accuracy targets
B. Human review for non-English languages after go-live
C. Translate all outputs to English and evaluate in English for consistency
D. Comparative language evaluations with parity thresholds
A common pitfall in evaluating AI performance is relying on "aggregate accuracy," which can mask poor performance in specific demographics or subgroups. To detect these "subgroup regressions," auditors should look for "comparative evaluations," in which the model is tested separately for each language. "Parity thresholds" set the maximum allowable difference in performance between the majority group (English) and minority groups, so any gap larger than the threshold is flagged. This supports fairness and a consistent user experience, which makes D the best control. A weighted metric with a higher accuracy target (A) is still a single aggregate number and can hide a weak subgroup. Post-go-live human review (B) is reactive, and translating everything to English (C) ignores the nuances of the original language that likely caused the errors in the first place.
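To make the idea of a comparative evaluation with a parity threshold concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than anything prescribed by ISACA: the function names `accuracy_by_language` and `parity_violations`, the 0.05 threshold, and the toy evaluation records are all hypothetical.

```python
from collections import defaultdict

def accuracy_by_language(records):
    """Compute accuracy separately for each language.

    records: iterable of (language, is_correct) pairs (hypothetical format).
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for language, is_correct in records:
        totals[language] += 1
        correct[language] += int(is_correct)
    return {lang: correct[lang] / totals[lang] for lang in totals}

def parity_violations(per_language, reference="en", max_gap=0.05):
    """Return languages whose accuracy trails the reference by more than max_gap.

    max_gap is the parity threshold; 0.05 is an arbitrary illustrative value.
    """
    ref_accuracy = per_language[reference]
    return {
        lang: ref_accuracy - acc
        for lang, acc in per_language.items()
        if lang != reference and ref_accuracy - acc > max_gap
    }

# Toy data (made up for illustration): 20 English and 5 Spanish test cases.
records = (
    [("en", True)] * 19 + [("en", False)] * 1
    + [("es", True)] * 3 + [("es", False)] * 2
)

# The aggregate looks healthy because English dominates the test set.
aggregate = sum(ok for _, ok in records) / len(records)

per_language = accuracy_by_language(records)
violations = parity_violations(per_language, reference="en", max_gap=0.05)

print(f"aggregate accuracy: {aggregate:.2f}")
for lang, acc in sorted(per_language.items()):
    print(f"{lang}: {acc:.2f}")
print("parity violations:", sorted(violations))
```

In this toy data the aggregate is pulled up by the larger English sample, while the per-language breakdown exposes the Spanish gap that the parity check then flags. In practice the threshold and the choice of reference group would be set by policy, not hard-coded.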