The percentage of missing values (option B) is the metric that most directly reflects data collection issues. Missing or incomplete data can degrade model performance, distort feature distributions, and lead to biased or inaccurate predictions.
AAIA stresses that auditors must evaluate:
Completeness
Validity
Accuracy
Consistency
Missing values signal failures in upstream processes, including sensors, user inputs, integrations, or data pipelines.
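As a rough illustration of how this metric might be measured, the sketch below computes the percentage of missing values per field. The helper names (`is_missing`, `missing_percentage`), the sample records, and the choice to count `None`, blank strings, NaN, and absent keys as missing are all assumptions made for the example; they are not taken from AAIA material.

```python
import math


def is_missing(value):
    """Return True for None, blank strings, and float NaN (an assumed definition)."""
    if value is None:
        return True
    if isinstance(value, str) and value.strip() == "":
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return False


def missing_percentage(records, fields):
    """Percentage of missing values per field across a list of dict records."""
    if not records:
        return {field: 0.0 for field in fields}
    result = {}
    for field in fields:
        # row.get() returns None for absent keys, so those count as missing too.
        missing = sum(1 for row in records if is_missing(row.get(field)))
        result[field] = 100.0 * missing / len(records)
    return result


# Hypothetical sample records, for illustration only.
records = [
    {"sensor_id": "a1", "temperature": 21.5, "operator": "kim"},
    {"sensor_id": "a2", "temperature": None, "operator": ""},
    {"sensor_id": "a3", "temperature": float("nan"), "operator": "lee"},
    {"sensor_id": "a4", "temperature": 19.0},
]
report = missing_percentage(records, ["sensor_id", "temperature", "operator"])
print(report)
```

In this sample, a high percentage on `temperature` or `operator` would prompt an auditor to look at the sensor feed or the input form that supplies that field.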
The other metrics are unrelated to raw data integrity:
Epochs (A) count full passes over the training data; they describe the training process, not data quality.
Percentage of training data (C) concerns dataset partitioning, not quality.
True positives (D) relate to model performance, not data collection quality.
References: AAIA Domain 2: Data Quality, Completeness, and Integrity; AAIA Domain 3: Pre-Training Data Validation