If the training and testing sets are not separated, the model is evaluated on the same data it was trained on, creating a false sense of accuracy. The result is overfitting (option A): the model learns the training data too well, including its noise, and fails to generalize to new data.
AAIA emphasizes that proper data splitting is foundational to machine learning evaluation. Overfitting undermines real-world performance, creates untrustworthy predictions, and hides bias or errors.
Model drift (B) occurs after deployment, as data distributions change over time. Hallucinations (C) are a phenomenon of generative models, not of evaluation methodology. Underfitting (D) occurs when the model is too simple; it does not result from a lack of dataset separation.
Thus, overfitting is the direct and greatest risk when training and testing sets are not segregated.
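To make this concrete, here is a minimal sketch of a proper train/test split. It assumes scikit-learn with an illustrative synthetic dataset and model (the exam material does not specify any tooling); the point is that scoring on training data alone hides the generalization gap that the held-out test set reveals.

```python
# Minimal sketch: why separating training and testing data matters.
# Dataset, model, and parameters are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Synthetic data stands in for any real dataset.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Hold out a test set the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# A deep, unconstrained tree memorizes the training data.
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Evaluating on training data gives a false sense of accuracy (near 100%);
# the held-out test set exposes the drop in real generalization.
print("Train accuracy:", accuracy_score(y_train, model.predict(X_train)))
print("Test accuracy: ", accuracy_score(y_test, model.predict(X_test)))
```

Without the split, only the first (inflated) score would be visible, which is exactly the overfitting risk described above.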
References:
AAIA Domain 2: Testing Techniques and Model Evaluation Standards.
AAIA Domain 1: Fundamentals of ML Model Training.