Option C best satisfies the requirement for rapid, correlated detection of model-related performance degradation. Amazon CloudWatch Application Insights provides automated observability across application components running on Amazon EC2, identifying abnormal behavior patterns without requiring extensive manual configuration.
Using custom metrics for recommendation quality, token usage, and response latency allows the company to directly monitor FM behavior, not just infrastructure health. Applying dimensions such as request type and user segment enables fine-grained correlation between performance issues and specific customer interactions or workloads.
CloudWatch anomaly detection is critical because it establishes dynamic baselines from historical data and detects deviations automatically. This enables alerts to be generated within minutes when FM behavior changes unexpectedly, satisfying the 10-minute alerting requirement without static thresholds that can miss subtle degradations.
CloudWatch Logs Insights complements metrics by enabling rapid analysis of log patterns, error messages, or unusual request flows associated with degraded recommendations. Because all data remains within CloudWatch, correlation between metrics, logs, and alerts is straightforward and operationally efficient.
Option A focuses on infrastructure metrics and lacks behavioral baselining. Option B provides tracing but not automated anomaly detection. Option D adds significant operational overhead and ingestion complexity for a use case already well supported by CloudWatch-native features.
Therefore, Option C delivers the most effective, scalable, and low-overhead observability solution for detecting FM-related performance deviations.
Submit