Training Data Statistics: Watson OpenScale needs to understand the characteristics of the data the model was trained on. This includes things like the distribution of features, sensitive attributes (for fairness monitoring), and how the model performed on this initial data. These "training data statistics" are crucial for:
Fairness Configuration: Recommending fairness attributes, reference groups, and monitored groups.
Bias Detection: Calculating fairness metrics (like disparate impact) by comparing runtime behavior to the learned training data distribution.
Explainability: Generating explanations by understanding the distribution of values in the training data to create meaningful perturbations.
Drift Detection: Building a drift detection model that compares runtime data to the training data to identify shifts.
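To make the disparate impact metric mentioned under Bias Detection concrete, here is a minimal Python sketch of the standard ratio: the favorable-outcome rate of the monitored group divided by that of the reference group. This is not Watson OpenScale's API or implementation. The function names, the `sex` attribute, the `outcome` label, and the sample records are all hypothetical and chosen only for illustration.

```python
def favorable_rate(records, attribute, group, label_key, favorable):
    """Fraction of records in `group` that received the favorable outcome."""
    members = [r for r in records if r[attribute] == group]
    if not members:
        raise ValueError(f"no records for group {group!r}")
    hits = sum(1 for r in members if r[label_key] == favorable)
    return hits / len(members)


def disparate_impact(records, attribute, monitored, reference,
                     label_key="outcome", favorable="approved"):
    """Monitored group's favorable rate divided by the reference group's."""
    reference_rate = favorable_rate(records, attribute, reference,
                                    label_key, favorable)
    if reference_rate == 0:
        raise ValueError("reference group has no favorable outcomes")
    monitored_rate = favorable_rate(records, attribute, monitored,
                                    label_key, favorable)
    return monitored_rate / reference_rate


# Hypothetical sample data (not from any real model or dataset).
records = [
    {"sex": "female", "outcome": "approved"},
    {"sex": "female", "outcome": "denied"},
    {"sex": "female", "outcome": "denied"},
    {"sex": "female", "outcome": "approved"},
    {"sex": "male", "outcome": "approved"},
    {"sex": "male", "outcome": "approved"},
    {"sex": "male", "outcome": "approved"},
    {"sex": "male", "outcome": "denied"},
]

di = disparate_impact(records, "sex", monitored="female", reference="male")
print(f"disparate impact: {di:.3f}")
```

A ratio near 1 means both groups receive favorable outcomes at similar rates. A ratio well below 1 suggests the monitored group is favored less often.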
While Watson OpenScale also consumes payload data (the data sent to the deployed model for predictions) at runtime to calculate various metrics and perform monitoring, the initial setup and the ability to generate meaningful statistics for things like fairness and drift fundamentally rely on understanding the training data.
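As a rough illustration of why the training data is the baseline, the sketch below summarizes one numeric feature from training data and then measures how far the payload mean has moved from it. This is a deliberately simple stand-in, not the drift detection model that Watson OpenScale builds. The helper names and the age values are hypothetical.

```python
import statistics


def summarize(values):
    """Baseline statistics for one numeric feature of the training data."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.pstdev(values),  # population standard deviation
        "min": min(values),
        "max": max(values),
    }


def mean_shift(training_stats, payload_values):
    """Payload mean minus training mean, in training standard deviations."""
    if training_stats["stdev"] == 0:
        raise ValueError("training feature has zero variance")
    payload_mean = statistics.mean(payload_values)
    return (payload_mean - training_stats["mean"]) / training_stats["stdev"]


# Hypothetical feature values (not from any real dataset).
training_ages = [20, 30, 40, 50, 60]
payload_ages = [50, 60, 70]

stats = summarize(training_ages)
shift = mean_shift(stats, payload_ages)
print(f"training mean: {stats['mean']}, shift: {shift:.2f} standard deviations")
```

Without the training summary there is nothing to compare the payload against, which is the point the explanation above makes.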