A data scientist selects publicly available demographic data from a limited group of familiar zip codes when training an algorithm. What are two ethical concerns with this approach? Choose 2 answers.
The data scientist's approach raises two key ethical concerns:
Proxy Discrimination (C) – Using demographic data from limited zip codes may create biased AI models that discriminate against certain racial, economic, or social groups. Zip codes often reflect historical segregation, leading to unfair predictions.
Small Sample Size (D) – Training an AI model on a limited set of familiar zip codes introduces selection bias, reducing the model's ability to generalize across diverse populations. This affects fairness and accuracy.
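The selection-bias problem in (D) can be shown with a small, purely illustrative sketch. All zip-code labels and approval rates below are hypothetical, invented for the example: a model fit only on two "familiar" zip codes learns a statistic that badly mis-estimates the wider population.

```python
import random

random.seed(0)

# Hypothetical approval rates per zip code (illustrative values only).
# "A" and "B" are the familiar zip codes the data scientist samples;
# "C" and "D" are excluded from training.
BASE_RATE = {"A": 0.80, "B": 0.75, "C": 0.40, "D": 0.35}

def sample(zips, n=1000):
    """Draw n (zip, outcome) rows from the given zip codes."""
    rows = []
    for _ in range(n):
        z = random.choice(zips)
        approved = 1 if random.random() < BASE_RATE[z] else 0
        rows.append((z, approved))
    return rows

def mean_rate(rows):
    return sum(outcome for _, outcome in rows) / len(rows)

# A model trained only on familiar zip codes learns an inflated rate...
train = sample(["A", "B"])
learned_rate = mean_rate(train)

# ...which does not generalize to the full population.
population = sample(["A", "B", "C", "D"])
true_rate = mean_rate(population)

print(f"rate learned from familiar zips: {learned_rate:.2f}")
print(f"true population rate:            {true_rate:.2f}")
```

Because zip codes correlate with the outcome, restricting the sample to familiar ones shifts the learned statistic well away from the population value, which is exactly the generalization failure described above.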
Relevant Ethical References in Technology:
- Algorithmic Fairness & Bias (FAT/ML Guidelines) – AI models must avoid biased proxies and ensure diverse training data.
- Privacy & Ethical AI (GDPR, CCPA) – AI decisions must not indirectly discriminate based on location, race, or socioeconomic status.
- Utilitarian Ethics (Maximizing Fairness & Inclusivity) – AI should benefit all groups equally, not just selected demographics.
- IEEE & ACM AI Ethics Standards – Encourage transparent, unbiased data selection to prevent discrimination.

Thus, the correct answers are C. Proxy discrimination and D. Small sample size, as both contribute to bias in AI models.